METHODS AND COMPOSITIONS FOR DISEASE DIAGNOSIS

FIELD OF THE INVENTION 

The invention relates to methods and compositions for risk assessment,
identification, diagnosis, prognosis, and/or monitoring of disease, and for early
therapeutic intervention.

BACKGROUND OF THE INVENTION 

It is axiomatic that early diagnosis and concomitant early therapeutic intervention
are the key to successful treatment and/or management of most human disorders.
However, many disorders cannot be diagnosed until the pathological process is already
advanced. For example, many solid tumors are usually not clinically detectable before
they can be palpated or visualized by tissue imaging techniques (i.e., when they are at
least 0.5 cm in size), at which time neoplasia may have been present for years. Similarly,
the diagnostic criterion for diabetes mellitus (increased fasting plasma glucose levels or
hyperglycemia) identifies the disorder when glucose intolerance (the underlying cause of
hyperglycemia) is already present. In another example, rheumatoid arthritis (RA) is
diagnosed by the presence of joint stiffness and soreness and the presence of positive
rheumatoid factor, all factors that indicate RA is already present and may be advanced.


Diagnostic Disease Markers 

In cancer, progression from preneoplasia to malignancy is accompanied by the
accumulation of genetic changes in the neoplastic cells that lead to histopathological
modifications. In some circumstances, when such a genetic change corresponds to an
increase in a protein made by the tumor cells, such a protein can be detected in the tumor
or in body fluids (if secreted from the tumor), and used as a biological tumor marker.
Most tumors have been associated with one or more such tumor markers. Such markers
have been evaluated as potential tools to diagnose cancer, determine prognosis, and/or
monitor cancer progression. However, many tumor markers are detectable only after
neoplasia has already progressed to the stage of formation of a tumor. In some cases, a
tumor marker may not be detectable until a tumor is already malignant. Thus, many of
the most widely used tumor markers are used primarily to monitor disease progression or
response to treatment rather than for early diagnosis.

In rheumatoid arthritis, anti-cyclic citrullinated peptide (anti-CCP) antibodies,
anti-keratin antibodies (AKA) and IgM rheumatoid factors have been suggested as
markers for rheumatoid arthritis (Bas et al., Rheumatology (Oxford), 2002, 41(7):809-14).
However, the value of such markers remains inconclusive (Scott, Rheumatology
(Oxford), 2000, 39(Suppl. 1):24-9). Similarly, while several protein and gene markers
have been found to correlate with the presence of active diabetes, the use of such markers
for diagnosis or prediction has not been proven valuable at this time for either type 1 or
type 2 diabetes (see the National Academy of Clinical Biochemistry (NACB) Laboratory
Medicine Practice Guidelines: Guidelines and Recommendations for Laboratory Analysis
in the Diagnosis and Management of Diabetes Mellitus, 2002, available online at the
NACB web site).



Genomics and Proteomics Tools for Disease Diagnosis

The development of high throughput screening approaches such as functional
genomics and proteomics has provided a new biological platform to search for molecules
associated with different disorders. Gene-expression profiles based on microarray
analysis have been of some use to predict survival of patients with lung carcinoma (Beer
et al., 2002, Nat. Med., 8(8):816-24). A similar approach identified a group of genes that
were said to be useful to predict the clinical outcome of diffuse large B-cell lymphoma
following combination chemotherapy (Shipp et al., 2002, Nat. Med., 8(1):68-74). In
addition, comparison of the proteomic profile of patients with ovarian or prostate cancer
with that of non-cancerous volunteers was said to have provided a set of serum proteins
that might be useful for early cancer detection (Petricoin et al., 2002, Lancet,
359(9306):572-7; Petricoin et al., 2002, J. Natl. Cancer Inst., 94(20):1576-8).

At present, most functional genomics studies in cancer have used cancer samples
obtained from patients to generate cancer-associated gene expression profiles (either by
a genomics or a proteomics approach).

A need remains for methods to detect and diagnose disease. Particularly needed
are predictive methods and markers for early stage or very early stage disease detection
and risk assessment.

SUMMARY OF THE INVENTION

The methods described herein are based, at least in part, on the discovery that the
central nervous system (CNS) exhibits specific changes in gene expression (e.g., changes
in patterns of gene expression) in response to the presence of a peripheral (non-CNS)
disease or disorder (e.g., a hyperproliferative disorder such as a non-CNS tumor or
cancer, an immunological disorder, an inflammatory disorder, a metabolic disorder, or a
pathogenic infection). While not bound by any theory, the inventors believe that specific
changes in gene expression in the CNS, e.g., in the brain, occur in response to the
presence of peripheral disease at an early stage in the development of the disease, e.g.,
before the disorder is clinically detectable and/or before the subject is symptomatic.
Thus, peripheral disorders can be diagnosed at an early stage and targeted for early
therapeutic intervention by analyzing changes or patterns in gene expression in the CNS.

Accordingly, in one aspect, the invention features a method of diagnosing a
non-CNS disorder in a subject, such as a human. The non-CNS disorder can be, e.g., a
hyperproliferative disorder, e.g., a non-CNS tumor or cancer; an immunological disorder,
e.g., rheumatoid arthritis; an inflammatory or allergic disorder, e.g., asthma; a metabolic
disorder, e.g., diabetes or obesity; or a pathogenic infection, e.g., a viral infection. The
method includes detecting expression of a gene in a CNS sample of the subject, e.g., a
brain tissue or cell (such as a tissue or cell of the hypothalamus, the cerebellum, the
midbrain, the hippocampus, the prefrontal cortex or the striatum) or a sample of
cerebrospinal fluid (CSF). The method optionally includes a step of obtaining the CNS
sample. A change in gene expression compared to a reference value, e.g., a control or
basal value, is correlated with the presence of a non-CNS disorder. The method is not
limiting in that it can be used to detect the risk or presence of any non-CNS disorder. In
one embodiment, the non-CNS disorder is not lymphoma.
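
By way of illustration only, the comparison of a detected expression value against a reference (control or basal) value could be sketched as a log2 fold-change calculation. The function names and the cutoff of 1.0 (a two-fold change) are assumptions made for this sketch; the specification does not prescribe a particular metric or threshold.

```python
import math


def log2_fold_change(sample_value: float, reference_value: float) -> float:
    """Log2 ratio of a sample expression value to a reference value."""
    if sample_value <= 0 or reference_value <= 0:
        raise ValueError("expression values must be positive")
    return math.log2(sample_value / reference_value)


def is_changed(sample_value: float, reference_value: float,
               min_abs_log2_fc: float = 1.0) -> bool:
    """True when expression differs from the reference by at least the cutoff.

    The default cutoff (a two-fold change in either direction) is a
    hypothetical choice, not a value taken from the specification.
    """
    return abs(log2_fold_change(sample_value, reference_value)) >= min_abs_log2_fc
```

An increase or a decrease of the same magnitude is treated symmetrically here, since the text refers to a "change" in expression without restricting its direction.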
The subject can be a human. In one embodiment, the human is not symptomatic
for the disorder to be diagnosed. In another embodiment, the disorder is not clinically
detectable, e.g., it is not detectable by a routine general clinical exam.

Detecting expression of a gene in a CNS sample can include detecting or
determining a value for one or more of: the level of mRNA, rate of transcription, amount
of a gene product, and activity of a gene product. In some embodiments, expression of a
single gene in the CNS may be detected, where a change in gene expression in that gene
is associated with the presence of a non-CNS disorder. In other embodiments, expression
of a plurality of genes (e.g., a panel or cluster of genes) may be evaluated, where a
specific profile of gene expression of the plurality of genes is associated with the
presence of a particular non-CNS disorder.

The method can include correlating the result of the detecting step to the presence
or absence of a non-CNS disorder. "Correlating" means identifying the probability,
based on the result of the detecting step, that the subject has or does not have a non-CNS
disorder. Correlating can include generating a dataset from, or providing a record of, the
detecting step, e.g., a printed or computer readable record such as a laboratory record or
dataset. The record can include other information, such as a specific subject identifier, a
sample identifier for the CNS sample, a date, the identity of the operator of the method,
and/or other information. The record can be used to provide or store information about
the subject. For example, the record can be used to provide information (e.g., to the
subject, a health care provider, the government, or an insurance company). The record or
information derived from the record can be used, e.g., to identify the subject as suitable
or unsuitable for a particular therapy or a particular clinical trial group.
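
As a non-limiting sketch, a computer readable record of the detecting step could be represented as a small data structure serialized to JSON. The class name, field names, and example values below are hypothetical and chosen only to mirror the items listed above (subject identifier, sample identifier, date, operator); no particular record format is required.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date


@dataclass
class DetectionRecord:
    """Hypothetical record of one detecting/correlating step."""
    subject_id: str
    sample_id: str
    sample_type: str            # e.g., "CSF" or a brain region
    test_date: str              # ISO 8601 date string
    operator: str
    probability_of_disorder: float

    def __post_init__(self) -> None:
        # A probability outside [0, 1] indicates a data entry error.
        if not 0.0 <= self.probability_of_disorder <= 1.0:
            raise ValueError("probability must be between 0 and 1")

    def to_json(self) -> str:
        """Serialize the record so it can be stored or transmitted."""
        return json.dumps(asdict(self), sort_keys=True)


# Placeholder values for illustration only; not data from the specification.
record = DetectionRecord(
    subject_id="SUBJ-0001",
    sample_id="CSF-0001",
    sample_type="CSF",
    test_date=date(2003, 1, 15).isoformat(),
    operator="operator-01",
    probability_of_disorder=0.82,
)
```

Calling `record.to_json()` yields a string that can be printed, stored, or sent to the subject's health care provider or another party.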

In the methods described herein, gene expression of a CNS gene can be detected
by any technique available to the skilled artisan, e.g., genomics or proteomics microarray
analysis of a CNS biological sample, such as brain tissue or CSF; or brain imaging
techniques that detect changes in gene expression. In one embodiment, the method
involves detecting a CNS gene product released or secreted into the CSF. In such
embodiments, an agent (such as an antibody, e.g., a labeled antibody) for detecting the
gene product can be immobilized on a solid phase, e.g., in a dipstick format.
The gene or genes to be evaluated will depend on the specific gene or profile of
gene expression associated with a particular disorder. For example, exemplary genes (or
profiles or clusters of genes) that are regulated in response to the presence of cancer cells
(or particular types of cancer cells) are shown in FIGS. 2-26, infra. Such genes are also
referred to herein as CNS "marker genes" or "disease surveillance genes" for non-CNS
disorders. The exemplary CNS marker genes are not limiting, as the methods described
herein can include the detection of any other genes or gene products determined to
exhibit a change in expression associated with the presence of a peripheral disorder.
CNS marker genes can include, inter alia, genes encoding hormones, growth factors,
immune system components, and cytokines.

In another aspect, the invention features a method of determining whether a
subject (e.g., a human) has, or is at risk for developing, a peripheral (non-CNS) disorder.
The method involves providing or obtaining a test gene expression profile for two or
more CNS genes in the subject; and comparing the test gene expression profile with a
reference gene expression profile (e.g., a reference gene expression profile described
herein), wherein the reference gene expression profile is associated with the presence of a
particular non-CNS disorder. Non-limiting examples of reference gene expression
profiles (e.g., associated with colon, breast or lung carcinoma) are disclosed herein. In
one embodiment, the method includes generating a record of the result (e.g., a laboratory
record or dataset) of the comparing step; and, optionally, transmitting the record (e.g., by
print or computer readable material) to the subject, the subject's health care provider or
another party. As with other methods described herein, various techniques can be used to
provide a gene expression profile and various types of disorders can be detected.

In another aspect, the invention features a method of treating a subject, e.g., a
human, by diagnosing a peripheral (non-CNS) disorder in the subject, e.g., using a method
described herein; and administering to the subject a therapeutic agent for the treatment of
the disorder, e.g., a chemotherapeutic agent. Because the detection/diagnostic methods
described herein can indicate the presence of peripheral disease at an early stage in the
pathogenic process (e.g., before a disorder is symptomatic or clinically diagnosable), the
methods allow for early intervention to control the disorder, e.g., by implementing lifestyle
changes to stop or slow further progress of the disease, or by administering a therapeutic
agent to slow or control the progression of the disease. Such agents can advantageously
be used at lower dosages than are typically used after a disease is sufficiently advanced to
be clinically diagnosed.

In another aspect, the invention features a method of identifying a diagnostic
marker gene for a peripheral (non-CNS) disorder in a subject. The method involves:
inducing a non-CNS disorder in a test experimental animal, e.g., a rodent tumor model;
comparing expression of a gene in a CNS tissue or cell in the test experimental animal to
expression of the gene in a control experimental animal; and selecting as a diagnostic
marker a gene (or human homolog of the gene) that is differentially expressed in the test
experimental animal compared to the control experimental animal.
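
The comparing and selecting steps of the marker identification method can be sketched as follows. This is a minimal illustration under stated assumptions: the specification does not say which statistical test is used, so an exact two-sided permutation test on the difference in group means is substituted here, and the function names, gene names, and expression values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(test_values, control_values):
    """Exact two-sided permutation p value for a difference in group means."""
    pooled = list(test_values) + list(control_values)
    n_test = len(test_values)
    observed = abs(mean(test_values) - mean(control_values))
    hits = 0
    total = 0
    # Enumerate every way of relabelling the pooled animals as "test".
    for idx in combinations(range(len(pooled)), n_test):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating point ties.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            hits += 1
    return hits / total


def select_markers(expression, alpha=0.05):
    """Return genes differentially expressed between test and control animals.

    expression maps a gene name to (test animal values, control animal values).
    """
    return sorted(
        gene for gene, (test, control) in expression.items()
        if permutation_p_value(test, control) < alpha
    )


# Hypothetical CNS expression values for four test and four control animals.
expression = {
    "gene_a": ([5.1, 5.3, 5.2, 5.4], [3.0, 3.1, 2.9, 3.2]),
    "gene_b": ([4.0, 4.2, 3.9, 4.1], [4.1, 3.9, 4.2, 4.0]),
}
candidate_markers = select_markers(expression)
```

With four animals per group there are 70 possible relabellings, so the smallest achievable two-sided p value is 2/70, which is still below the 0.05 level used in this sketch. Larger groups would call for a sampled rather than exhaustive permutation test.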

The methods described herein are useful, inter alia, for risk assessment for a 
variety of disorders, for early detection and diagnosis of disease, for monitoring of 
progression of disease, for monitoring efficacy of treatment for a disease, and/or 
evaluation of clinical status. 

As used herein a "disorder" or "disease" is an alteration in the state of the body or
of some of its cells, tissues, or organs, that threatens health. The two terms are meant to
encompass all stages of an illness, including the very early stages of an illness (e.g., early
alterations in the body that may not be detectable by the subject or a health care provider,
but nonetheless set in motion a disease process). For example, the terms "disorder" and
"disease" encompass the state of neoplasia, before a neoplasm or tumor is formed; early
immunological reactions to an antigen, e.g., in the development of rheumatoid arthritis or
asthma, before inflammation or allergy are symptomatic; and early changes in energy
metabolism that promote weight gain, before weight gain is produced.

As used herein, "neoplasia" is an unregulated and progressive proliferation of
cells under conditions that would not elicit, or would cause cessation of, proliferation of
normal cells. Neoplasia can result in the formation of a "neoplasm," a new and abnormal
growth of tissue. If the abnormally proliferating cells form a mass, a neoplasm is
generally referred to as a "tumor." A neoplasm may be benign or malignant (cancerous).

As used herein, the term "matches" or "matching" when referring to a test gene
expression profile and a reference gene expression profile, means that the profiles are
sufficiently similar to each other to have an analogous cause or effect. Two profiles are
matching if they are at least 70% identical in reference to the number of genes having
similar expression patterns in each profile, or the level of expression of the genes in each
profile. In some embodiments, two profiles can be at least 80%, 85%, 90%, 95%, 98%,
or more, identical.
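
One possible reading of the 70% matching criterion is sketched below. It assumes, for illustration only, that each profile is reduced to a direction of change ("up" or "down") per gene and that identity is computed as the fraction of reference genes whose direction agrees in the test profile. The function names and this particular similarity measure are assumptions; the definition above would equally admit a comparison based on expression levels.

```python
from typing import Mapping


def match_fraction(test: Mapping[str, str], reference: Mapping[str, str]) -> float:
    """Fraction of reference genes whose direction of change agrees in the test profile.

    Genes absent from the test profile count as disagreements.
    """
    if not reference:
        raise ValueError("reference profile is empty")
    agree = sum(
        1 for gene, direction in reference.items() if test.get(gene) == direction
    )
    return agree / len(reference)


def profiles_match(test: Mapping[str, str], reference: Mapping[str, str],
                   threshold: float = 0.70) -> bool:
    """True when the profiles are at least `threshold` identical."""
    return match_fraction(test, reference) >= threshold


# Hypothetical ten-gene reference profile and a test profile agreeing on eight genes.
reference_profile = {f"gene_{i}": ("up" if i % 2 == 0 else "down") for i in range(10)}
test_profile = dict(reference_profile)
test_profile["gene_0"] = "down"   # direction disagrees
del test_profile["gene_1"]        # gene not measured in the test sample
```

Raising `threshold` to 0.80, 0.85, 0.90, 0.95, or 0.98 corresponds to the stricter embodiments mentioned above.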

A "subject" is a human or animal that is tested for the presence of a possible
disorder. The animal can be a mammal, e.g., a domesticated animal such as a dog, cat,
horse, pig, cow or goat; an experimental animal such as an experimental rodent (e.g., a
mouse, rat, guinea pig, or hamster); a rabbit; or an experimental primate, e.g., a
chimpanzee or monkey.

Although methods and materials similar or equivalent to those described herein
can be used in the practice or testing of the present invention, suitable methods and
materials are described below. All publications, patent applications, patents, and other
references mentioned herein are incorporated by reference in their entirety. In case of
conflict, the present specification, including definitions, will control. In addition, the
materials, methods, and examples are illustrative only and not intended to be limiting.

Other features and advantages of the invention will be apparent from the
following detailed description, the drawings, and the claims.

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a schematic diagram of a slide used in a microarray gene expression
assay.

FIG. 2-(1-7) is a table showing the results of cluster analysis I for colon cancer in
midbrain. This cluster analysis identified differentially expressed genes (p < 0.05) up- or
down-regulated at one of two experimental time points after injection of cancer cells into
the relevant animal model. Genes with similar expression patterns were clustered using
hierarchical clustering techniques. The table lists 407 markers differentially expressed in
the midbrain of mice injected with colon cancer cells at one of the two experimental time
points studied for colon cancer (72 and 198 hours). Each listed marker gene belongs to
one of 12 clusters, as indicated by the left hand column of the table. FIG. 2-8 is a set of
cluster graphs that illustrates the relative differential expression of each cluster at each of
the time points tested. In each cluster diagram (in all the figures), the y-axis represents a
value for the relative level of gene expression of the cluster compared to control (the
midline at 0) and the x-axis represents time. In this case, 1.0 = 72 hours and 2.0 = 198
hours. Thus, expression values below the midline indicate that the genes in that cluster are
down-regulated at that time. Expression values above the midline indicate that the genes in
that cluster are up-regulated at that time.

FIG. 3-1 is a table showing the results of cluster analysis II for colon cancer in
midbrain. Cluster analysis II identified differentially expressed genes (p < 0.05) up- or
down-regulated at both experimental time points tested (72 and 198 hours) after injection
of cancer cells into the relevant animal model. The table lists 41 markers (in 12 clusters)
differentially expressed in the midbrain of mice injected with colon cancer cells at both
72 and 198 hours. FIG. 3-2 is a set of cluster graphs that illustrates the relative
differential expression of each cluster compared to control at each time point.
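
The distinction between cluster analysis I and cluster analysis II, as described for FIGS. 2 and 3, can be sketched as a count of the time points at which a gene is differentially expressed. The sketch assumes that analysis I covers genes significant at exactly one time point and analysis II covers genes significant at two or more (that is, both time points in a two-point design); the function names and p values are hypothetical.

```python
def significant_time_points(p_values, alpha=0.05):
    """Return the time points (in hours) at which the p value is below alpha."""
    return [hours for hours, p in sorted(p_values.items()) if p < alpha]


def assign_analysis(p_values, alpha=0.05):
    """Assign a gene to cluster analysis "I", "II", or None.

    "I"  : differentially expressed at exactly one time point (assumed reading)
    "II" : differentially expressed at two or more time points
    None : not differentially expressed at any time point
    """
    count = len(significant_time_points(p_values, alpha))
    if count == 0:
        return None
    return "I" if count == 1 else "II"


# Hypothetical p values keyed by hours after injection of cancer cells.
gene_one_point = {72: 0.01, 198: 0.40}
gene_both_points = {72: 0.01, 198: 0.03}
gene_unchanged = {72: 0.20, 198: 0.60}
```

The same function applies unchanged to the three-point designs (18, 72 and 198 hours), where analysis II then corresponds to genes significant at two or three time points.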


FIG. 4-(1-15) is a table showing the results of cluster analysis I for breast cancer
in midbrain. The table lists 698 markers (in 12 clusters) differentially expressed in the
midbrain of mice injected with breast cancer cells at one of the three experimental time
points studied (18, 72 and 198 hours). FIG. 4-16 is a set of cluster graphs that illustrates
the relative differential expression of each cluster compared to control (y-axis) at each of
the three time points (1.0 = 18 hours, 2.0 = 72 hours and 3.0 = 198 hours) (x-axis).

FIG. 5-(1-6) is a table showing the results of cluster analysis II for breast cancer
in midbrain. The table lists 299 markers (in 12 clusters) differentially expressed in the
midbrain of mice injected with breast cancer cells at two or three of the experimental
time points studied (18, 72 and 198 hours). FIG. 5-7 is a set of cluster graphs that
illustrates the relative differential expression of each cluster compared to control (y-axis)
at each of the three time points (1.0 = 18 hours, 2.0 = 72 hours and 3.0 = 198 hours)
(x-axis).

FIG. 6-(1-14) is a table showing the results of cluster analysis I for lung cancer in
midbrain. The table lists 797 markers (in 12 clusters) differentially expressed in the
midbrain of mice injected with lung cancer cells at one of the three experimental time
points studied. FIG. 6-15 is a set of cluster graphs that illustrates the relative differential
expression of each cluster compared to control (y-axis) at each of the three time points
(1.0 = 18 hours, 2.0 = 72 hours and 3.0 = 198 hours) (x-axis).

FIG. 7-(1-4) is a table showing the results of cluster analysis II for lung cancer in
midbrain. The table lists 230 markers (in 12 clusters) expressed in the midbrain of mice
injected with lung cancer cells at two or three of the three experimental time points
studied. FIG. 7-5 is a set of cluster graphs that illustrates the relative differential
expression of each cluster compared to control (y-axis) at each of the three time points
(1.0 = 18 hours, 2.0 = 72 hours and 3.0 = 198 hours) (x-axis).

FIG. 8-(1-12) is a table showing the results of cluster analysis I for colon cancer in
cortex. The table lists 688 markers (in 12 clusters) differentially expressed in the cortex
of mice injected with colon cancer cells at one of two experimental time points studied.
FIG. 8-13 is a set of cluster graphs that illustrates the relative differential expression of
each cluster compared to control (y-axis) at each of the two time points (1.0 = 72 hours
and 2.0 = 198 hours) (x-axis).

FIG. 9-(1-2) is a table showing the results of cluster analysis II for colon cancer in
cortex. The table lists 58 markers (in 12 clusters) differentially expressed in the cortex of
mice injected with colon cancer cells at two or three of the three experimental time points
studied. FIG. 9-3 is a set of cluster graphs that illustrates the relative differential
expression of each cluster compared to control (y-axis) at each of the three time points
(1.0 = 18 hours, 2.0 = 72 hours and 3.0 = 198 hours) (x-axis).

FIG. 10-(1-12) is a table showing the results of cluster analysis I for breast cancer
in cortex. The table lists 744 markers (in 12 clusters) differentially expressed in the
cortex of mice injected with breast cancer cells at one of the three experimental time
points studied. FIG. 10-13 is a set of cluster graphs that illustrates the relative differential
expression of each cluster compared to control (y-axis) at each of the three time points
(1.0 = 18 hours, 2.0 = 72 hours and 3.0 = 198 hours) (x-axis).

FIG. 11-(1-5) is a table showing the results of cluster analysis II for breast cancer
in cortex. The table lists 272 markers (in 12 clusters) differentially expressed in the
cortex of mice injected with breast cancer cells at two or three of the three experimental
time points studied. FIG. 11-6 is a set of cluster graphs that illustrates the relative
differential expression of each cluster compared to control (y-axis) at each of the three
time points (1.0 = 18 hours, 2.0 = 72 hours and 3.0 = 198 hours) (x-axis).

FIG. 12-(1-12) is a table showing the results of cluster analysis I for lung cancer in
cortex. The table lists 828 markers (in 12 clusters) differentially expressed in the cortex
of mice injected with lung cancer cells at one of the three experimental time points
studied. FIG. 12-13 is a set of cluster graphs that illustrates the relative differential
expression of each cluster compared to control (y-axis) at each of the three time points
(1.0 = 18 hours, 2.0 = 72 hours and 3.0 = 198 hours) (x-axis).

FIG. 13-(1-5) is a table showing the results of cluster analysis II for lung cancer in
cortex. The table lists 311 markers (in 12 clusters) differentially expressed in the cortex
of mice injected with lung cancer cells at two or three of the three experimental time
points studied. FIG. 13-6 is a set of cluster graphs that illustrates the relative differential
expression of each cluster compared to control (y-axis) at each of the three time points
(1.0 = 18 hours, 2.0 = 72 hours and 3.0 = 198 hours) (x-axis).


FIG. 14-(1-7) is a table showing the results of cluster analysis I for colon cancer in
striatum. The table lists 361 markers (in 12 clusters) differentially expressed in the
striatum of mice injected with colon cancer cells at one of the two experimental time
points studied. FIG. 14-8 is a set of cluster graphs that illustrates the relative differential
expression of each cluster compared to control (y-axis) at each of the two time points
(1.0 = 72 hours and 2.0 = 198 hours) (x-axis).

FIG. 15-1 is a table showing the results of cluster analysis II for colon cancer in
striatum. The table lists 40 markers (in 12 clusters) differentially expressed in the
striatum of mice injected with colon cancer cells at both experimental time points
studied. FIG. 15-2 is a set of cluster graphs that illustrates the relative differential
expression of each cluster compared to control (y-axis) at each of the two time points
(1.0 = 72 hours and 2.0 = 198 hours) (x-axis).

FIG. 16-(1-8) is a table showing the results of cluster analysis I for lung cancer in
striatum. The table lists 483 markers (in 12 clusters) differentially expressed in the
striatum of mice injected with lung cancer cells at one of the three experimental time
points studied. FIG. 16-9 is a set of cluster graphs that illustrates the relative differential
expression of each cluster compared to control (y-axis) at each of the three time points
(1.0 = 18 hours, 2.0 = 72 hours and 3.0 = 198 hours) (x-axis).


FIG. 17-(1-4) is a table showing the results of cluster analysis II for lung cancer in
striatum. The table lists 234 markers (in 12 clusters) differentially expressed in the
striatum of mice injected with lung cancer cells at two or three of the three experimental
time points studied. FIG. 17-5 is a set of cluster graphs that illustrates the relative
differential expression of each cluster compared to control (y-axis) at each of the three
time points (1.0 = 18 hours, 2.0 = 72 hours and 3.0 = 198 hours) (x-axis).

FIG. 18-(1-7) is a table showing the results of cluster analysis I for colon cancer in
hypothalamus. The table lists 389 markers (in 12 clusters) differentially expressed in the
hypothalamus of mice injected with colon cancer cells at one of the two experimental
time points studied. FIG. 18-8 is a set of cluster graphs that illustrates the relative
differential expression of each cluster compared to control (y-axis) at each of the time
points (1.0 = 72 hours and 2.0 = 198 hours) (x-axis).

FIG. 19-1 is a table showing the results of cluster analysis II for colon cancer in
hypothalamus. The table lists 51 markers (in 12 clusters) differentially expressed in the
hypothalamus of mice injected with colon cancer cells at both experimental time points
studied. FIG. 19-2 is a set of cluster graphs that illustrates the relative differential
expression of each cluster compared to control (y-axis) at each of the three time points
(1.0 = 18 hours, 2.0 = 72 hours and 3.0 = 198 hours) (x-axis).


FIG. 20-(1-20) is a table showing the results of cluster analysis I for breast cancer
in hypothalamus. The table lists 1252 markers (in 12 clusters) differentially expressed in
the hypothalamus of mice injected with breast cancer cells at one of the three
experimental time points studied. FIG. 20-21 is a set of cluster graphs that illustrates the
relative differential expression of each cluster compared to control (y-axis) at each of the
three time points (1.0 = 18 hours, 2.0 = 72 hours and 3.0 = 198 hours) (x-axis).

FIG. 21-(1-6) is a table showing the results of cluster analysis II for breast cancer
in hypothalamus. The table lists 366 markers (in 12 clusters) differentially expressed in
the hypothalamus of mice injected with breast cancer cells at two or three of the three
experimental time points studied. FIG. 21-7 is a set of cluster graphs that illustrates the
relative differential expression of each cluster compared to control (y-axis) at each of the
three time points (1.0 = 18 hours, 2.0 = 72 hours and 3.0 = 198 hours) (x-axis).

FIG. 22-(1-20) is a table showing the results of cluster analysis I for lung cancer in
hypothalamus. The table lists 1160 markers (in 12 clusters) differentially expressed in the
hypothalamus of mice injected with lung cancer cells at one of the three experimental
time points studied. FIG. 22-21 is a set of cluster graphs that illustrates the relative
differential expression of each cluster compared to control (y-axis) at each of the three
time points (1.0 = 18 hours, 2.0 = 72 hours and 3.0 = 198 hours) (x-axis).

FIG. 23-(1-6) is a table showing the results of cluster analysis II for lung cancer in
hypothalamus. The table lists 306 markers (in 12 clusters) differentially expressed in the
hypothalamus of mice injected with lung cancer cells at two or three of the three
experimental time points studied. FIG. 23-7 is a set of cluster graphs that illustrates the
relative differential expression of each cluster compared to control (y-axis) at each of the
three time points (1.0 = 18 hours, 2.0 = 72 hours and 3.0 = 198 hours) (x-axis).

FIG. 24(A)-(C) is a set of tables listing tumor-specific CNS markers differentially 
expressed, at any time tested, in three different cancer models: colon cancer, 24A; breast 
cancer, 24B; and lung cancer, 24C. Criteria for inclusion in this figure were (1) the 
marker corresponds to a secreted product; and (2) a p value below 0.05 for differential 
expression. 
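
The two inclusion criteria stated for FIG. 24 can be expressed as a simple filter, sketched below. The dictionary layout, field names, and marker entries are hypothetical; how a marker is annotated as "secreted" is outside the scope of this sketch.

```python
def passes_inclusion_criteria(marker, alpha=0.05):
    """True when a marker is a secreted product and is differentially
    expressed (p < alpha) at any time point tested."""
    return bool(marker["secreted"]) and min(marker["p_values"].values()) < alpha


# Hypothetical candidate markers; p values are keyed by hours after injection.
candidates = [
    {"name": "marker_a", "secreted": True, "p_values": {72: 0.01, 198: 0.30}},
    {"name": "marker_b", "secreted": False, "p_values": {72: 0.01, 198: 0.02}},
    {"name": "marker_c", "secreted": True, "p_values": {72: 0.20, 198: 0.40}},
]
included = [m["name"] for m in candidates if passes_inclusion_criteria(m)]
```

Taking the minimum p value across time points implements the "at any time tested" wording: a single significant time point is enough for inclusion.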

FIG. 25(A)-(E) is a set of tables listing genes identified as CNS markers that are 
also potential targets for therapeutic intervention for each of colon, breast and lung 
cancer. Criteria for inclusion in this figure were (1) the marker corresponds to a signaling 
receptor such as a growth factor, hormone, or cytokine; and (2) a p value for differential 
expression below 0.05. 

FIG. 26 is a table listing CNS markers differentially expressed at at least one time
point studied in all tumors analyzed.
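
Markers common to all tumor models, of the kind listed in FIG. 26, can be derived as a set intersection over the per-tumor marker lists. The following is a minimal sketch; the function name and marker names are hypothetical placeholders.

```python
def shared_markers(markers_by_tumor):
    """Return the markers present in the marker list of every tumor model."""
    marker_sets = [set(markers) for markers in markers_by_tumor.values()]
    if not marker_sets:
        return set()
    return set.intersection(*marker_sets)


# Hypothetical per-tumor lists of markers differentially expressed at
# at least one time point.
markers_by_tumor = {
    "colon": ["marker_a", "marker_b", "marker_c"],
    "breast": ["marker_b", "marker_c", "marker_d"],
    "lung": ["marker_c", "marker_b", "marker_e"],
}
common = sorted(shared_markers(markers_by_tumor))
```

Markers outside the intersection remain candidates for tumor-specific profiles, while the shared set points to a general CNS response to the presence of a tumor.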

DETAILED DESCRIPTION 

The methods described herein rely, in part, on the detection of gene expression in 
the CNS to identify (e.g., diagnose or monitor) peripheral (non-CNS) tissues or organs 
for early stages of disease (e.g., in some cases, within hours, days, weeks or months of 
the appearance of disease). Early identification and/or diagnosis of disease provides an 
opportunity for early therapeutic intervention to target the disorder before it becomes 
overly advanced or aggressive. 

General Methodology 

The CNS is involved in the body's response to any internal or external stimulus 
that by its intensity or functional relevance could alter internal homeostasis. As part of 
this function, the CNS and the immune system interact to obtain a suitable immune 
response when necessary. 
An immune response impacts the brain via neural and humoral mechanisms.
Neural mechanisms primarily involve the activation of the vagal nerve. Humoral
mechanisms can include cytokine-mediated action directly on brain structures, e.g.,
cytokine-mediated increases in neural firing rates (Rothwell and Hopkins, 1995, Trends
Neurosci., 18(3):130-6; Wang et al., 2003, Nature, 421(6921):384-8). In one example,
peripheral cytokines have been shown to bind and activate the vagal nerve, which in turn
activates neurons of the nucleus of the tractus solitarius and the hypothalamus in the brain
(Watkins and Maier, 1999, Proc. Natl. Acad. Sci. USA, 96(14):7710-3).

Humoral signals from the periphery act as potent messengers to the brain.
Cytokines in the brain can exert their action at a much lower dose than in the periphery.
For example, intracerebral administration of interleukin-1 (IL-1) at a dose of 100 pg to
10 ng elicits maximal changes in fever, gastric function, increased metabolism and
behavioral changes, while several micrograms of this cytokine are necessary to elicit
similar responses when administered to the periphery (Rothwell and Hopkins, supra).

After sensing an internal immune signal, the brain reacts in different ways. A
paradigm of CNS response to immune signals is the activation of neuroendocrine axes
such as the hypothalamus-pituitary-adrenal (HPA) axis. The activation of this axis results in the
liberation of glucocorticoids, which in turn can modulate the ongoing immune response
in under 10 minutes. Vagotomy has been shown to blunt the activation of the HPA axis
after intraperitoneal administration of cytokines (Watkins and Maier, supra). This
feedback mechanism is of high physiological relevance; i.e., inhibition of glucocorticoid
production after cytokine release in the periphery usually results in the death of the 
organism (Besedovsky and del Rey, 1996, Endocr. Rev., 17(1):64-102). 

The brain can also sense signals from the external milieu that will affect the immune
and other systems. For example, the triggering of a stress reaction can result in the
release of glucocorticoids and the attenuation of an ongoing immune response. The
effects of stress on the immune system are well documented in animal models and
humans (Deinzer et al., 2000, Int. J. Psychophysiol., 37(3):219-32; Marshall et al., 1998,
Brain Behav. Immun., 12(4):297-307; Benschop et al., 1996, FASEB J., 10(4):517-24;
Sheridan et al., 1998, Ann. N.Y. Acad. Sci., 840:803-8). In addition, there is anecdotal
and preliminary evidence that mind/body interventions such as meditation or yoga could
have an influence on the immune system (Cassileth, 1999, CA Cancer J. Clin., 49(6):362-75).

The new methods harness this natural reaction of the CNS as a way to detect 
peripheral disease at an early stage. 

Cancer Development

It is generally accepted that a clinically detectable tumor mass is composed of 
cells that, although abnormal, evade immune surveillance and resist immune system 
attack. During the time of neoplastic progression, cells are characterized by high
mutation rates, reflected, inter alia, in phenotypic changes such as downregulation of
histocompatibility antigens. A tumor may thus become resistant to a particular 
therapeutic by clonal selection and proliferation from the tumor mass of a cell clone 
having a mutation that allows the cell to resist the given therapeutic. The "natural
selection" of tumor cell clones occurs at a given rate leading to the appearance of
malignant cells having genetic and epigenetic traits that facilitate growth and escape from
the immune system. It is estimated that the average malignancy contains more than
10,000 mutations (Stoler et al., 1999, Proc. Natl. Acad. Sci. USA, 96(26):15121-6).
Therefore, it can be concluded that the antigen profile of established cancers by no means 
reflects the cell genotype and phenotype of very early stage neoplasia. Moreover, it is
reasonable to assume that tumor antigens present in the established cancer and the
response they can induce in the organism will be different than the antigens and 
responses induced by early stage neoplastic cells. The new methods can detect such early 
stage neoplastic cells in spite of these obstacles. 

Some neoplasms, e.g., some cancers (e.g., certain types of carcinoma), can grow
for long periods (e.g., for 1, 2, 5, 10, 15, 20 or 25 years) before they are clinically
detectable using prior known technology and/or before they become malignant. This
period provides an extraordinary window of opportunity for detection of cancerous cells 
before the malignant tumor is clinically detectable by current strategies. During this 
period tumor cells undergo several modifications at the molecular level as a result of their
genomic instability.
Each genetic change is potentially selective for proliferation and/or is capable of
triggering a new "alarm signal" to recruit and activate local innate and adaptive immune 
responses. In a simple view, 10,000 alarm signals might be produced during the 10 to 15 
years of tumor development before the tumor is clinically detectable. 

Development of Rheumatoid Arthritis 

Rheumatoid arthritis (RA) is an acquired autoimmune disease in which genetic 
factors appear to play a role. RA occurs in 1-2 percent of the general population and is 
found world-wide. Females with RA outnumber males by 3:1. Onset of the disease in
adults is usually between the ages of 40 and 60 years, although it can occur at any age.

RA involves Th1 lymphocytes and macrophage infiltration into joints as well as
the presence of rheumatoid factors in patients' serum (Chernajovsky et al., 2000, Genes
Immun., 1:295-307). Degradation of cartilage is accompanied by the outgrowth of
synovial membrane (pannus). This process is generally regulated by IL-1 and TNF-α,
while TGF-β and IL-10 counteract this effect (Chernajovsky et al., ibid). Susceptibility
to arthritis has been correlated with the MHC class II locus, in particular HLA-DR4 in 70
percent of patients with RA (Chernajovsky et al., ibid). Rheumatoid Factor(s) (RF) are 
antibodies to IgG, and are present in 60-80 percent of adults with the disease. High titers 
of RF are usually associated with more severe and active joint disease, greater systemic
involvement, and a poorer prognosis for remission.

An unknown antigen is thought to initiate the autoimmune response resulting in 
RA. It has been suggested that there is a synovial antigen resembling a bacterial 
lipopolysaccharide (LPS) of arthritogenic bacteria that initiates the autoimmune response 
(Kennedy, 2000, Med. Hypotheses, 54(5):723-5). TNF-α appears to be the driving force
behind the chronic inflammation characteristic of RA. TNF-α also plays an important
role in B cell maturation, which appears to participate in disease progression
(Chernajovsky et al., ibid). Some data also strongly indicate a role for Suppressor of
Cytokine Signaling (SOCS) in disease outcome (Egan et al., 2003, J. Clin. Invest.,
111(6):915-24).

The initiation of the autoimmune response and/or the initiation of the
inflammatory mechanisms in the early development of RA are likely to trigger signals 
detected by changes in gene expression in the CNS. 

Development of Asthma

Asthma is an inflammatory airway disease characterized by the presence of cells
such as eosinophils, mast cells, basophils, and CD25+ T lymphocytes in the airway walls.
Chemokines attract cells to the site of inflammation and cytokines (interleukin (IL)-4, IL-5,
IL-10 and IL-13) activate them, resulting in inflammation and damage to the mucosa.
When asthma becomes chronic, secondary changes occur, such as thickening of basement
membrane and fibrosis. IL-4 and other cytokines such as TGF-β may be involved in
tissue remodeling and the fibrotic response.

In allergic asthma (also known as extrinsic asthma), the initiating event of airway
inflammation is an immunological reaction to allergen. Continued exposure to allergen
results in chronic inflammation. Allergic asthma affects about 3 million children (8 to 12
percent of all children) and 7 million adults in the United States at a cost estimated at
$6.2 billion a year. It has been suggested that longitudinal studies based on yet
unidentified inflammatory markers will guide asthma management in the future (Wilson,
2002, Curr. Opin. Pulm. Med., 8(1):25-32).

In the development of asthma, the initiation of the allergic or inflammatory
response, e.g., release of cytokines and/or chemokines, can likely trigger signals detected
by changes in gene expression in the CNS.

Development of Obesity 

Body size and body weight are highly heritable traits. Association studies
performed with populations of monozygotic and dizygotic twins, non-twin siblings and
adoptive family members indicated that the variance for body mass index (body weight
divided by the square of height) is much lower in identical twins than in any other group,
indicating that genetic factors rather than environmental effects are the key determinant
of human adiposity (Maes et al., 1997, Behav. Genet., 27:325-351; Allison et al., 1996,
Int. J. Obes. Relat. Metab. Disord., 20:501-506). Diet-induced obesity is also highly
heritable. A pioneer study performed in 12 pairs of young adult identical twins overfed
by 1,000 kcal per day during a 100-day period demonstrated that overfeeding induced a 
variable increase in body weight in all volunteers. However, twin pairs had six times less
variance in mass increase than non-twin pairs, indicating that adaptation to long-term
overfeeding has important genetic factors (Bouchard et al., 1990, N. Engl. J. Med.,
322:1477-1482). The strong genetic predisposition to gain weight after ingesting a fat- 
rich diet is even more clearly observed in the laboratory when testing mice or rats of 
different genetic backgrounds (Schaffhauser et al., 2002, Obes. Res., 10:1188-1196).
Most strains of mice maintain their body weight throughout relatively long periods of
time while being fed ad libitum with low fat diets. However, when fed ad libitum with a
high fat diet, some strains develop a considerable increase in body mass and some other 
strains are resistant to this increase regardless of increase in food consumption (West et 
al., 1995, Am. J. Physiol., 268:R658-R665; Prpic et al., 2003, Endocrinology, 144:1155- 
1163). 
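
For reference, the body mass index mentioned above (body weight divided by the square of height) can be written as follows. The symbols m and h are introduced here only for illustration, and the conventional units of kilograms and meters are assumed.

```latex
\mathrm{BMI} = \frac{m}{h^{2}}, \qquad m \text{ in kg}, \quad h \text{ in m}
```

As a purely illustrative calculation with hypothetical values, a subject weighing 70 kg with a height of 1.75 m has a BMI of 70 / (1.75)^2, or about 22.9 kg/m^2.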

The regulation of body weight involves a large number of interconnected
peripheral and brain circuits that participate in the control of energy balance throughout
the entire organism (Spiegelman and Flier, 2001, Cell, 104:531-43). Information about 
the amount of energy stored in the whole body is transported into the brain by peripheral 
hormones such as leptin and insulin. The relative variation of the plasma concentration
of these hormones is interpreted by central mechanisms to induce signals of appetite or
satiety (Friedman and Halaas, 1998, Nature, 395:763-70). Other molecules such as 
ghrelin and cholecystokinin (CCK) enter into the brain after being released from different 
portions of the gastrointestinal tract and provide essential information to brain centers 
about the nutritional status of the organism (Murakami et al., 2002, J. Endocrinol.,
174:283-288; Sheng and Moran, 2002, Neuropeptides, 36:171-181).

The hypothalamus, a critical brain area for the complicated control of energy 
homeostasis, integrates a variety of converging signals within a short time frame. In the 
ventral hypothalamus a group of appetite-inducing neurons expresses the neuropeptide Y 
(NPY) gene. As leptin levels in the circulation drop, NPY is released into the
paraventricular nucleus of the hypothalamus to induce food intake (Widdowson et al.,
1999, Peptides, 20:367-372). A single intracerebroventricular administration of NPY in
mice or rats can dramatically increase food intake for many hours (Zarjevski et al., 1993,
Endocrinology, 133:1753-1758). Conversely, another group of neurons located in the 
arcuate nucleus of the hypothalamus expresses the proopiomelanocortin gene (POMC). 
These neurons also express the leptin receptor gene. After an excessive intake of
fat-enriched food, the levels of triglycerides rise, filling peripheral adipocytes with fat stores.
This leads to an increase in production of leptin, which is released into the circulation and 
eventually enters the brain by a selective uptake mechanism (Hileman et al., 2002, 
Endocrinology, 143:775-783). Leptin stimulates leptin receptors located in POMC 
neurons, thereby increasing their firing activity (Cowley et al., 2001, Nature,
411:480-484).

One of the active peptides produced from the POMC precursor is α-melanocyte
stimulating hormone (α-MSH). Upon stimulation of leptin receptors, α-MSH is released
in the paraventricular nucleus of the hypothalamus to induce satiety.
Intracerebroventricular injections of α-MSH in mice or rats induce long-lasting anorexia
that can promote the death of the animals if they are not forced to feed (Fan et al., 1997,
Nature, 385:165-168). 

The hormones, neuropeptides and their receptors described above are only a few 
examples of the many gene products that participate in the central control of energy 
balance. Regulation of a molecule involved in energy control (e.g., a disruption
associated with propensity or presence of obesity) can likely trigger signals that result in
changes in gene expression in the CNS. 

While not limited by any theory, the methods described herein are based, in part, 
on the discovery that the CNS senses the presence of "alarm signals" from peripheral
(non-CNS) disorders at an early stage of disease progression. Thus,
the methods described herein relate to diagnosing peripheral disorders by detecting gene 
expression in the CNS, e.g., in a CNS sample from a subject, such as a human. In one 
aspect, a non-CNS disorder can be identified based on a profile of gene expression in the 
CNS (e.g., the brain) within hours, weeks or months after disease progression is initiated
in the body. In some embodiments, a non-CNS disorder can be identified based on a
profile of gene expression in the CNS (e.g., the brain) within one or more years (e.g., 2,
3, 5, 7, 10 or more years) after disease progression is initiated in the body, but before a
disorder is clinically detectable and/or in an advanced stage. 

Methods of Detecting Gene Expression

Gene expression in the CNS can be detected in vitro, e.g., in an isolated CNS
sample, or in vivo, e.g., using in vivo imaging techniques.

Central Nervous System (CNS) Samples

The CNS refers to the brain (including the cranial nerves) and spinal cord. A CNS
sample can be, e.g., a cell or tissue from the brain or spinal cord, or a sample of the
cerebrospinal fluid (CSF) that fills the ventricles of the brain and the central canal of the 
spinal cord. 

Where the detection of gene expression is to be done in a CNS sample isolated 
from the subject, a CNS sample can be obtained by any number of methods available to
the skilled artisan. For example, a CNS cell or tissue sample can be obtained from the
brain, e.g., by needle biopsy or by open surgical incision. Imaging of the brain can be 
performed to determine the precise positioning of the needle or scalpel to enter the brain. 

In one example, known as stereotactic biopsy, a tiny hole is drilled into the skull 
with the patient under light sedation or general anesthesia, and a needle is inserted into
the brain tissue guided by computer-assisted imaging techniques such as computerized
tomography (CT) or magnetic resonance imaging (MRI) scans. The needle is used to 
remove a sample of cells, whose gene expression can then be detected by a routine assay, 
e.g., a gene expression assay described herein. In another example, a sample of CSF can 
be obtained by routine methods, such as by lumbar puncture. This procedure can be done
on an outpatient basis, e.g., under local anesthetic.

The number of cells or amount of CSF needed to perform a particular gene 
expression assay on a CNS sample will vary; however, some techniques, such as
PCR-based techniques, will require a very small number of cells, e.g., as few as 10 to 100 cells
(Klein et al., Nat. Biotechnol., 20(4):387-92, 2002). The CNS sample can be used
immediately in a diagnostic test described herein, or it can be stored, e.g., cooled or
frozen, and/or transported to a facility where the diagnostic test is performed.

Nucleic Acid-Based Methods

In one embodiment, the methods described herein will utilize techniques for 
detection of gene expression where a polynucleotide (such as an RNA, mRNA, DNA, 
cDNA, or other nucleic acid corresponding to the gene) is detected. It should be
understood by the skilled artisan that many methods for nucleic-acid based detection of 
gene expression exist and that any suitable method for detection can be used. Typical 
assay formats utilize nucleic acid hybridization and include, e.g., 1) nuclear run-on assay, 
2) slot blot assay, 3) northern blot assay, 4) magnetic particle separation, 5) nucleic acid
or DNA arrays or chips (also discussed in more detail below), 6) reverse northern blot
assay, 7) dot blot assay, 8) in situ hybridization, 9) RNase protection assay, 10) ligase 
chain reaction, 11) polymerase chain reaction (PCR), 12) reverse transcriptase (RT)-PCR, 
and 13) differential display RT-PCR (DDRT-PCR) or any combination of any two or 
more of these methods. Such assays can employ detectable labels such as
radioactive labels, enzyme labels, chemiluminescent labels, fluorescent labels, or other
suitable labels, to detect, identify, or monitor the presence or level of a particular nucleic 
acid being detected. Such techniques and labels are known in the art and widely 
available to the skilled artisan. 

In an exemplary embodiment, an RNase protection assay can be utilized in the
methods described herein by hybridizing multiple DNA probes corresponding to one or
more members of a panel of sequences to mRNA isolated from a CNS sample from a 
subject to be tested. The expression profile for one or more genes from the CNS sample 
can be compared to a reference profile, e.g., a basal pattern of expression, or other 
negative or positive control (e.g., a profile from a patient known to have no peripheral
disease, or a standard or average profile derived from subjects known to not have the
particular disorder being tested). In one example, the gene expression profile from the 
test CNS sample is compared to a reference gene expression profile that is associated 
with the presence of a non-CNS neoplasia. If the test gene expression profile matches the 
reference gene expression profile, it indicates that the subject has, or is at risk for
developing, the non-CNS neoplastic disorder.

The methods described herein are also well suited for polymerase chain reaction
(PCR)-based methods. PCR-based methods include RT-PCR (U.S. Patent No. 
4,683,202), ligase chain reaction (Barany, Proc. Natl. Acad. Sci. USA, 88:189-193, 
1991), self-sustained sequence replication (Guatelli et al., Proc. Natl. Acad. Sci. USA,
87:1874-1878, 1990), transcriptional amplification system (Kwoh et al., Proc. Natl. Acad.
Sci. USA, 86:1173-1177, 1989), Q-Beta Replicase (Lizardi et al., BioTechnology, 6:1197, 
1988), rolling circle replication (Lizardi et al., U.S. Patent No. 5,854,033), or any other 
nucleic acid amplification method, followed by the detection of the amplified molecules 
using techniques known in the art. PCR amplification of mRNAs expressed in a CNS
sample can be performed directly from mRNA isolated from the sample, or from cDNA
reverse-transcribed from such isolated mRNA. The amplified nucleic acid can then be 
hybridized to a particular probe of interest, e.g., a probe for a CNS gene as described 
herein, in order to determine its expression. The probe can be disposed on an address of 
an array, e.g., an array described herein below. Such methods are routine and are
particularly amenable to routine adaptation to automated systems employing
computer-controlled reagent aliquoting and signal detection. See, e.g., Klein et al., Nat.
Biotechnol., 2002, 20(4):387-92.

In another embodiment, in situ methods are used to detect the presence or level of 
mRNA corresponding to a particular gene. In such methods, a CNS cell or tissue sample
can be prepared/processed and immobilized on a support, typically a glass slide, and then
contacted with a probe (e.g., a probe for a CNS gene described herein). 

In still another embodiment, serial analysis of gene expression, as described in 
U.S. Patent No. 5,695,937, is used to detect transcript levels of a CNS gene described 
herein. 

Polypeptide-Based Methods

In one embodiment, the methods described herein utilize techniques for detection 
of gene expression where a gene product (polypeptide) encoded by a gene is detected or 
where an activity of the polypeptide, e.g., an enzymatic activity, is detected. Such 
methods are particularly advantageous for detecting the expression of genes that encode
polypeptides that are secreted from CNS cells, e.g., into the CSF.

A variety of methods can be used to determine the level of protein encoded by a
CNS gene. In general, these methods include contacting a CNS sample (such as a brain 
cell sample or a CSF sample) with an agent, such as an antibody, that selectively binds to 
the protein of interest. In one embodiment, the antibody bears a detectable label.
Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a
fragment thereof (e.g., Fab or F(ab')2) can be used. The term "labeled," with regard to
the probe or antibody, is intended to encompass direct labeling of the probe or antibody 
by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as 
well as indirect labeling of the probe or antibody by reactivity with a detectable
substance. Such detection methods can be used to detect a CNS gene product in a CNS
sample in vitro as well as in vivo. 

In vitro techniques include immunoassays such as enzyme linked immunosorbent 
assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay 
(EIA), radioimmunoassay (RIA), Western blot analysis, and Luminex™ xMAP™
detection assay. Some immunoassays are "sandwich" type assays, in which a target
analyte(s) is "sandwiched" between a labeled antibody and an antibody immobilized onto
a solid support. The assay is read by observing the presence and amount of antigen- 
labeled antibody complex bound to the immobilized antibody. Another immunoassay 
useful in the methods described herein is a "competition" type immunoassay, wherein an
antibody bound to a solid surface is contacted with a sample (e.g., a CSF sample)
containing both an unknown quantity of antigen analyte and labeled antigen of the
same type. The amount of labeled antigen bound on the solid surface is then determined 
to provide an indirect measure of the amount of antigen analyte in the sample. Such 
immunoassays are readily performed in a "dipstick" format (e.g., a flow-through or
migratory dipstick design) for convenient use. A dipstick-based assay optionally includes
an internal negative or positive control. Numerous types of dipstick immunoassays
are known in the art and are described, e.g., in U.S. Patents 5,656,448; 4,366,241;
and 4,770,853. In other embodiments, antibody based assays are performed in an array 
format. For example, a CNS sample is labeled, e.g., biotinylated, and then contacted with
an antibody, e.g., an antibody positioned on an antibody array. The sample can be
detected, e.g., with avidin coupled to a fluorescent label.

In vivo techniques include, e.g., introducing into a subject (e.g., into the CSF) a
labeled antibody that binds to the gene product to be detected. The antibody can be 
labeled, e.g., with a radioactive marker, whose presence and location in a subject can be 
detected by standard imaging techniques. 
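
By way of illustration only, the read-out of an in vitro competition-type immunoassay such as that described above could be converted to an analyte concentration roughly as sketched below. The sketch assumes a set of calibration standards of known concentration and interpolates the signal linearly against the logarithm of concentration. That interpolation scheme, the function name estimate_concentration, and all numerical values are hypothetical and are not taken from the present description; in practice a fitted curve (e.g., a four-parameter logistic model) is often used instead.

```python
from math import log10


def estimate_concentration(signal, standards):
    """Estimate analyte concentration from a competition-assay signal.

    standards: list of (concentration, bound_label_signal) pairs with
    positive concentrations. In a competition format, the bound labeled
    antigen (the signal) falls as the analyte concentration rises.
    """
    pts = sorted(standards)  # ascending concentration
    signals = [s for _, s in pts]
    if any(a <= b for a, b in zip(signals, signals[1:])):
        raise ValueError("standard signals must fall as concentration rises")
    for (c_lo, s_hi), (c_hi, s_lo) in zip(pts, pts[1:]):
        if s_lo <= signal <= s_hi:
            # Linear interpolation of signal against log10(concentration).
            frac = (s_hi - signal) / (s_hi - s_lo)
            return 10 ** (log10(c_lo) + frac * (log10(c_hi) - log10(c_lo)))
    raise ValueError("signal lies outside the range of the standard curve")


# Hypothetical calibration standards: (concentration, signal) pairs.
standards = [(1, 1000), (10, 700), (100, 400), (1000, 100)]

print(round(estimate_concentration(550, standards), 1))
```

With these hypothetical standards, a signal of 550 lies midway between the signals of the 10 and 100 standards, so the estimate is the geometric midpoint of those two concentrations (about 31.6 in the same arbitrary units). Signals outside the calibrated range are rejected rather than extrapolated.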

Polyclonal and monoclonal antibodies to be used to detect a particular CNS gene
product will, in certain cases, be available. For example, commercially available
antibodies exist for many of the CNS marker genes described herein. Alternatively, a 
skilled artisan can make a suitable antibody for use in a diagnostic assay using routine 
techniques. Methods of making and using polyclonal and monoclonal antibodies to
detect a particular target are described, e.g., in Harlow et al., Using Antibodies: A
Laboratory Manual: Portable Protocol No. I, Cold Spring Harbor Laboratory (December 1,
1998). Methods for making modified antibodies and antibody fragments (e.g., chimeric 
antibodies, reshaped antibodies, humanized antibodies, or fragments thereof, e.g., Fab', 
Fab, F(ab')2 fragments); or biosynthetic antibodies (e.g., single chain antibodies, single
domain antibodies (DABs), Fv, single chain Fv (scFv), and the like), are known in the art
and can be found, e.g., in Zola, Monoclonal Antibodies: Preparation and Use of
Monoclonal Antibodies and Engineered Antibody Derivatives, Springer Verlag
(December 15, 2000; 1st edition). 

Imaging of CNS Gene Expression

In one embodiment, the methods described herein utilize techniques for imaging 
of gene expression, e.g., non-invasive imaging of gene expression, in the CNS. For 
example, a labeled probe that is capable of detecting the expression of a target gene can 
be delivered into the brain through the blood-brain barrier (BBB) by targeting the labeled
probe to the brain via endogenous BBB transport systems, such as carrier-mediated
transport systems that exist for the transport of nutrients across the BBB. Similarly, 
receptor-mediated transcytosis systems operate to transport circulating peptides across 
the BBB, such as insulin, transferrin, or insulin-like growth factors. These endogenous 
peptides can act as "transporting peptides," or "molecular Trojan horses," to ferry a
labeled diagnostic probe, as described herein, across the BBB. The label can then be
detected by known brain imaging techniques. Such an approach is described, e.g., in
U.S. Patent No. 6,372,250. In other examples, Shi et al., Proc. Natl. Acad. Sci. USA,
2000, 97(26):14709-14 and Lee et al., J. Nucl. Med., 2002, 43(7):948-56 describe imaging
of gene expression in the brain in vivo using an antisense radiopharmaceutical combined 
with drug-targeting technology to traverse the BBB. 

Other methods of delivering into the brain a labeled probe that is capable of 
detecting the expression of a target gene are described, e.g., in U.S. Pat. No. 5,720,720. 
This patent describes methods of delivering agents (such as labeled antibodies for 
imaging gene products) into the brain by high-flow microinfusion. 



Detection of Changes in CNS Gene Expression in Bodily Fluids

In some cases, gene activation in the CNS can result in a measurable alteration in
a gene product at a distant site, e.g., in a fluid such as blood, urine or semen. It is known, 
e.g., that the cerebral cortex, hippocampus, entorhinal cortex, parts of the thalamus,
basal ganglia, cerebellum and the reticular formation influence the output of the
autonomic nervous system (Kandel et al., Principles of Neural Science, Third Edition,
Appleton & Lange). These influences could result in measurable alterations of gene
expression at the mRNA or protein level in autonomic ganglia or in innervated organs. An
example of this type of interaction is the immunomodulatory action of the activation of 
the vagus nerve after cytokine release in the periphery (Tracey, Nature, 420:853-9, 2002). 

In addition, gene activation in the CNS can be detected by measuring changes in 
blood proteins in some cases. For example, neurons in the CNS can trigger the release of 
hormones in blood via the activation of several neuroendocrine axes such as the 
hypothalamus-pituitary-adrenal, -gonadal or thyroid axes (Besedovsky and del Rey, 
Endocrine Reviews, 17:1-39, 1996). Moreover, brain extracellular fluid could drain into 
blood and deep cervical lymph (Cserr et al., Brain Pathol., 2(4):269-76, 1992). Cerebral
extracellular fluids drain from brain to blood across the arachnoid villi and to lymph 
along certain cranial nerves (primarily olfactory) and spinal nerve root ganglia. A 
minimum of 14 to 47% of protein injected into different regions of brain or cerebrospinal 
fluid passes through lymph. Thus, CSF markers could drain into, and be detected in, 
lymph, blood or serum. Such markers found in blood may also be enriched, and thereby 
detectable, in urine, due to selective filtration of blood components by the kidneys.

The CNS is connected to the testis via the autonomic nervous system as well as
the endocrine system. If a change in gene activity in the brain results in modifications in 
the activity of the hypothalamus-pituitary-gonadal axis or in the innervation of the testes, 
these changes could then be detected in fluids related to the testes, such as semen. For
example, patients with spinal cord injury have been shown to have alterations in the
composition of their semen (see Naderi and Safarinejad, Clin. Endocrinol., 58(2):177-84,
2003). 

Routine methods can be used to identify gene products in peripheral tissues, such 
as peripheral bodily fluids, which are the result of changes in gene expression in the
CNS. For example, a candidate marker gene can be disrupted in the brain of an
experimental animal. A change in the expression of a candidate gene in a peripheral
tissue in the experimental animal, compared to a wild type animal, indicates that the 
expression of the candidate molecule in the peripheral tissue is tied to changes in gene 
expression in the CNS. 

Arrays 

The methods described herein are readily adapted for nucleic acid or protein 
arrays, e.g., nucleic acid and/or protein "chips," following the methods and teachings 
known in the art. In a typical embodiment, an array chip includes multiple probes (e.g.,
DNA probes and/or antibody probes) for detection of expression of multiple CNS genes.
In one embodiment, the probes on a specific chip are chosen to detect the members of 
one or more specific panels or "clusters" of genes, each cluster being associated with a 
specific gene expression profile if a non-CNS neoplasia is present in the subject from 
whom the CNS sample was taken. A chip can contain tens, hundreds, or thousands of
individual probes immobilized (tethered) at discrete, predetermined locations (addresses
or "spots") on a solid, planar support, e.g., glass, metal, or nylon. An array can be a 
macroarray or microarray, the difference being in the size of the spots. Macroarrays 
contain spots of about 300 microns in diameter or larger and can be imaged using gel or 
blot scanners. Microarrays contain spots less than 300 microns, typically less than 200
microns, in diameter.

For analysis and comparison of profiles of gene expression in the methods
described herein, a nucleic acid array can be constructed using nucleic acid probes for at 
least four, e.g., at least 10, 20, 40, 60, 80 or 100 CNS genes. Such an array can include 
control probes (i.e., probes for genes whose expression is expected to remain unaffected
in a negative sample, e.g., a sample from a subject not having a non-CNS disorder).
Typically, such controls or "normal" non-disease samples are obtained from healthy 
volunteers. Longitudinal studies of healthy volunteers can be performed to confirm that 
the control samples are from individuals that remained disease free. Such studies can 
provide the raw data for a database of control gene expression profiles. Such a database
can provide a source of normal or control "reference" profiles that can be used in the
present methods. Control samples can also be obtained post-mortem from individuals 
who died for a reason unrelated to the disorder being diagnosed (e.g., individuals who 
died from an accidental trauma). In such cases, post-mortem samples should be taken as 
soon as possible after death, e.g., no later than 3 hours after death. 

A population of labeled cDNA representing total mRNA from a sample of a tissue
of interest, e.g., brain, spinal cord or CSF, is contacted with the DNA array under suitable
hybridization conditions. Hybridization of cDNAs with sequences in the array is 
detected, e.g., by fluorescence at particular addresses on the solid support. Thus, a 
pattern of fluorescence representing a gene expression pattern in the CNS sample of a
particular subject or group of subjects is obtained. These patterns of gene expression can
be digitized and stored electronically for computerized analysis and comparison. For 
example, an array can be used to compare expression of CNS genes in individuals being 
tested with one or more reference gene expression profiles stored electronically, e.g., in a 
digital database, where the reference gene expression profile is associated with either the
presence or absence of a peripheral neoplasia.

In some embodiments, cDNAs are used as probes to form the array. Suitable 
cDNAs can be obtained by conventional polymerase chain reaction (PCR) techniques, as 
described above. The length of the cDNAs can be from 20 to 2,000 nucleotides, e.g., 
from 100 to 1,000 nucleotides. Other methods known in the art for producing cDNAs
can be used. For example, reverse transcription of a cloned sequence can be used (for
example, as described in Sambrook et al., eds., Molecular Cloning: A Laboratory
Manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY, 1989). The cDNA probes are deposited or placed ("printed" or
"spotted") onto a suitable solid support (substrate), e.g., a coated glass microscope slide, 
at specific, predetermined locations (addresses) in a two-dimensional grid. A small
volume, e.g., 5 nanoliters, of a concentrated DNA solution is used in each spot. Spotting
can be carried out using a commercial microspotting device (sometimes called an 
arraying machine or gridding robot) according to the vendor's instructions. Commercial 
vendors of solid supports and equipment for producing DNA arrays include BioRobotics 
Ltd., Cambridge, UK; Corning Science Products Division, Acton, MA; GENPAK Inc.,
Stony Brook, NY; SciMatrix, Inc., Durham, NC; and TeleChem International, Sunnyvale,
CA. 

The cDNAs can be attached to the solid support by any suitable method. In 
general, the linkage is covalent. Suitable methods of covalently linking DNA molecules 
to the solid support include amino cross-linking and UV cross-linking. For guidance
concerning construction of cDNA arrays according to the invention, see, e.g., DeRisi et
al., Nature Genetics, 1996, 14:457-460; Khan et al., Electrophoresis, 1999, 20:223-229; 
Lockhart et al., Nature Biotechnol., 1996, 14:1675-1680. 

In some embodiments of the invention, the immobilized DNA probes in the array 
are synthetic oligonucleotides. Preformed oligonucleotides can be spotted to form a
DNA array, using techniques described herein with regard to cDNAs. In general,
however, the oligonucleotides are synthesized directly on the solid support. Methods for
synthesizing oligonucleotide arrays are known in the art. See, e.g., Fodor et al., U.S. 
Patent No. 5,744,305. The sequences of the oligonucleotides represent portions of the 
sequences of a particular gene to be detected above. Generally, the lengths of
oligonucleotides are 10 to 50 nucleotides, e.g., 15, 20, 25, 30, 35, 40, or 45 nucleotides.

Also useful in the methods are aptamer arrays. Aptamers are nucleic acid
molecules that bind to specific target molecules based on their three-dimensional 
conformation rather than hybridization. The aptamers are selected, for example, by 
synthesizing an initial heterogeneous population of oligonucleotides, and then selecting
oligonucleotides within the population that bind tightly to a particular target molecule.
Once an aptamer that binds to a particular target molecule has been identified, it can be
replicated using a variety of techniques known in biological and other arts, e.g., by
cloning and polymerase chain reaction (PCR) amplification followed by transcription. 
The target molecules can be nucleic acids, proteins, peptides, small organic and inorganic 
compounds, and even entire micro-organisms.

The synthesis of a heterogeneous population of oligonucleotides and the selection
of aptamers within that population can be accomplished using a procedure known as the
Systematic Evolution of Ligands by Exponential Enrichment or SELEX. The SELEX 
method is described in, e.g., Gold et al., U.S. Patent Nos. 5,270,163 and 5,567,588;
Fitzwater et al. ("A SELEX Primer," Methods in Enzymology, 267:275-301, 1996); and
in Ellington and Szostak ("In Vitro Selection of RNA Molecules that Bind Specific
Ligands," Nature, 346:818-22). Briefly, a heterogeneous DNA oligomer population is 
synthesized to provide candidate oligomers for the in vitro selection of aptamers. This 
initial DNA oligomer population is a set of random sequences 15 to 100 nucleotides in
length flanked by fixed 5' and 3' sequences 10 to 50 nucleotides in length. The fixed
regions provide sites for PCR primer hybridization and, in one implementation, for
initiation of transcription by an RNA polymerase to produce a population of RNA 
oligomers. The fixed regions also contain restriction sites for cloning selected aptamers. 
Many examples of fixed regions can be used in aptamer evolution. See, e.g., Conrad et 
al. ("In Vitro Selection of Nucleic Acid Aptamers That Bind Proteins," Methods in
Enzymology, 267:336-83, 1996); Ciesiolka et al. ("Affinity Selection-Amplification
from Randomized Ribooligonucleotide Pools," Methods in Enzymology, 267:315-35, 
1996); Fitzwater, supra. 
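
By way of illustration only, the layout of such an initial oligomer pool can be sketched in Python as follows. The fixed flanking sequences, the function name make_pool, and the pool size are hypothetical values chosen for the sketch and are not taken from the cited references; only the length ranges (15 to 100 nucleotides for the random region, 10 to 50 for each fixed region) come from the description above.

```python
import random

# Hypothetical fixed flanking regions (20 nt each), invented for illustration only.
FIXED_5P = "ACGTACGTACGTACGTACGT"
FIXED_3P = "TGCATGCATGCATGCATGCA"


def make_pool(n_oligos, random_length, fixed_5p=FIXED_5P, fixed_3p=FIXED_3P, seed=0):
    """Return n_oligos sequences: fixed 5' region + random core + fixed 3' region."""
    if not 15 <= random_length <= 100:
        raise ValueError("random region should be 15 to 100 nucleotides")
    for fixed in (fixed_5p, fixed_3p):
        if not 10 <= len(fixed) <= 50:
            raise ValueError("fixed regions should be 10 to 50 nucleotides")
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [
        fixed_5p + "".join(rng.choice("ACGT") for _ in range(random_length)) + fixed_3p
        for _ in range(n_oligos)
    ]


# Five candidate oligomers, each with a 40-nucleotide random core.
pool = make_pool(n_oligos=5, random_length=40)
```

A real pool would contain vastly more molecules than this toy list; the sketch is meant only to show how the random core sits between the two primer-binding regions.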

Aptamers are generally selected in a 5 to 100 cycle procedure. In each cycle, 
oligomers are bound to the target molecule, purified by isolating the target to which they
are bound, released from the target, and then replicated by 20 to 30 generations of PCR
amplification. 
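
As a rough worked example, the amplification achieved in one cycle can be estimated as follows. The sketch assumes idealized PCR in which every template is copied in every generation; the efficiency parameter and the function name are assumptions introduced here, and real reactions generally fall short of perfect doubling.

```python
def fold_amplification(generations, efficiency=1.0):
    """Fold increase in copy number after a number of PCR generations.

    efficiency=1.0 is the idealized case in which every template is
    copied in every generation (perfect doubling).
    """
    return (1.0 + efficiency) ** generations


low = fold_amplification(20)   # 2**20 = 1,048,576-fold
high = fold_amplification(30)  # 2**30 = 1,073,741,824-fold
```

Under this assumption, 20 to 30 generations correspond to roughly a million-fold to a billion-fold expansion of the oligomers recovered from the target in each cycle.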

Aptamer selection is similar to evolutionary selection of a function in biology. 
Subjecting the heterogeneous oligonucleotide population to the aptamer selection 
procedure described above is analogous to subjecting a continuously reproducing 
biological population to 10 to 20 severe selection events for the function, with each
selection separated by 20 to 30 generations of replication.

Heterogeneity is introduced, e.g., only at the beginning of the aptamer selection
procedure, and does not occur throughout the replication process. Alternatively, 
heterogeneity can be introduced at later stages of the aptamer selection procedure. 

Various oligomers can be used for aptamer selection, including, e.g.,
2'-fluoro-ribonucleotide oligomers, NH2-substituted and OCH3-substituted ribose aptamers, and
deoxyribose aptamers. RNA and DNA populations are equally capable of providing 
aptamers configured to bind to any type of target molecule. Within either population, the 
selected aptamers occur at a frequency of 1 in 10^9 to 1 in 10^13, see Gold et al. ("Diversity of
Oligonucleotide Functions," Annual Review of Biochemistry, 64:763-97, 1995), and most
frequently have nanomolar binding affinities to the target, affinities as strong as those of
antibodies to cognate antigens. See Griffiths et al., (EMBO J., 13:3245-60, 1994). 

Using 2'-fluoro-ribonucleotide oligomers is likely to increase binding affinities
ten- to one hundred-fold over those obtained with unsubstituted ribo- or
deoxyribo-oligonucleotides. See Pagratis et al. ("Potent 2'-amino and
2'-fluoro-2'-deoxyribonucleotide RNA inhibitors of keratinocyte growth factor," Nature
Biotechnology, 15:68-73). Such modified bases provide additional binding interactions
and increase the stability of aptamer secondary structures. These modifications also 
make the aptamers resistant to nucleases, a significant advantage for real-world
applications of the system. See Lin et al. ("Modified RNA sequence pools for in vitro
selection," Nucleic Acids Research, 22:5229-34, 1994); Pagratis, supra.

In the present invention, aptamers can be used to detect, e.g., mRNAs, cDNAs, or 
proteins corresponding to CNS marker genes. 

In some embodiments of the invention, probes (e.g., nucleic acid probes, 
antibodies, or aptamers) for the human homologs of CNS genes are used in the detection
method. In other embodiments, the probe used for detection consists of highly conserved
regions of a gene, e.g., a sequence that is highly conserved between homologous mouse
and human sequences.

Sample Preparation and Analysis 

In methods of the invention, the transcription level of one or more CNS genes is
assumed to be reflected in the amount of its corresponding mRNA present in cells of an
assayed CNS sample. In general, mRNA from the CNS cells or tissue is copied into
cDNA under conditions such that the relative amounts of cDNA produced representing 
specific genes reflect the relative amounts of the mRNA in the sample. Comparative 
hybridization methods involve comparing the amounts of various, specific mRNAs in
two tissue samples, as indicated by the amounts of corresponding cDNAs hybridized to
sequences from the genes of interest. 

The mRNA used to produce cDNA is generally isolated from other cellular 
contents and components. One useful approach for mRNA isolation is a two-step 
approach. In the first step, total RNA is isolated. The second step is based on
hybridization of the poly(A) tails of mRNAs to oligo(dT) molecules bound to a solid
support, e.g., a chromatographic column or magnetic beads. Total RNA isolation and 
mRNA isolation are known in the art and can be accomplished, for example, using 
commercial kits according to the vendor's instructions. Similarly, synthesis of cDNA 
from isolated mRNA is known in the art and can be accomplished using commercial kits
according to the vendor's instructions. Fluorescent labeling of cDNA can be achieved by
including a fluorescently labeled deoxynucleotide, e.g., Cy5-dUTP or Cy3-dUTP, in the 
cDNA synthesis reaction. For guidance concerning isolation of mRNA and synthesis of 
fluorescently labeled cDNA for analysis on a DNA array, see, e.g., Ross et al., Nature 
Genetics 2000, 24:227-235. 

In the invention, conventional techniques for hybridization and washing of DNA
arrays, detection of hybridization, and data analysis can be employed routinely without
undue experimentation. Commercial vendors of hardware and software for scanning 
DNA arrays and analyzing data include Cartesian Technologies, Inc. (Irvine, CA); GSI 
Lumonics (Watertown, MA); Genetic Microsystems Inc. (Woburn, MA); and Scanalytics,
Inc. (Fairfax, VA).

In other embodiments, the expression level of one or more CNS genes is reflected 
in the presence and/or level of protein present in cells of a CNS sample to be assayed. 
The presence or level of protein in a CNS sample can be detected by routine methods. 
For example, a CNS sample (e.g., a CSF sample) can be analyzed by gel electrophoresis
techniques such as 2-dimensional (2D) PAGE. Once protein spots are separated on a
2D-PAGE gel, differentially expressed spots can be identified, e.g., by matrix-assisted laser
desorption ionization time of flight (MALDI-TOF) and electrospray ionization (ESI) mass spectrometry.
This method can also be used for peptide analysis to provide the fingerprint of a 
particular protein in a sample. 

A second proteomic approach can involve obtaining a proteomic spectrum by 
directly analyzing a CNS sample, such as a CSF sample, by mass spectroscopy. For
example, surface enhanced laser desorption ionization time of flight (SELDI-TOF) 
analysis can be performed to generate a proteomic pattern from a CNS sample. 
SELDI-TOF analysis has been shown to be able to identify a cluster pattern that 
differentiates between normal and disease patients. See Paweletz et al., Dis. Markers,
17(4):301-7, 2001.

Generating Gene Expression Profiles 

A gene expression profile used in the methods described herein is a pattern of 
expression of two or more CNS genes. In some cases, an expression profile can be a
pattern of expression of 5, 10, 25, 50, 100, 200, 500, or more genes. A "reference gene
expression profile" as used herein is a characteristic pattern of expression of two or more 
CNS genes, where the pattern of expression is associated with risk or presence of a 
particular disorder. The association between the characteristic profile and the particular 
disorder is determined through the generation and analysis of CNS gene expression data
to mine and identify correlations between particular patterns of CNS gene expression
(e.g., relative increases and/or decreases of gene expression of particular genes compared 
to a negative control) and particular clinical states. For example, a reference gene 
expression profile can be a set of genes (also referred to herein as a "panel" or "cluster" 
of genes), where each gene of the set is either down-regulated or up-regulated when
associated with a specific peripheral disorder or any peripheral disorder. A reference
profile can also include a value, e.g., a relative value, of gene expression for two or more 
genes in a panel, where at least one gene of the panel is down-regulated and at least one 
gene is up-regulated. An example of such a gene expression profile is a profile that 
includes a value for the relative differential expression of at least 2, e.g., between 2 and
50, of the genes shown in any of the tables of FIGS. 24A-C or between two and seven of
the genes listed in FIG. 26. Such a reference profile is associated with the presence of
early stage carcinoma. Other examples are provided by each of the clusters disclosed in
FIGS. 2-26. For example, clusters 9 and 10 of FIG. 13(1-5) each provide a profile or 
panel of genes that are strongly down-regulated in the cortex in response to the presence 
of lung cancer. 
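
The notion of a profile of relative values, with each gene marked as up-regulated or down-regulated against a negative control, can be sketched as follows. This is a minimal illustration and not the analysis used to produce the figures: the gene names, the intensity values, and the two-fold threshold are all assumptions made for the sketch.

```python
import math


def relative_expression(disease, control, min_fold=2.0):
    """Map each gene to (log2 ratio versus control, direction of change)."""
    cutoff = math.log2(min_fold)
    profile = {}
    for gene, level in disease.items():
        log_ratio = math.log2(level / control[gene])
        if log_ratio >= cutoff:
            direction = "up"
        elif log_ratio <= -cutoff:
            direction = "down"
        else:
            direction = "unchanged"
        profile[gene] = (round(log_ratio, 2), direction)
    return profile


# Hypothetical signal intensities for three placeholder genes.
disease_sample = {"GENE_A": 250.0, "GENE_B": 1600.0, "GENE_C": 510.0}
control_sample = {"GENE_A": 1000.0, "GENE_B": 400.0, "GENE_C": 500.0}
profile = relative_expression(disease_sample, control_sample)
```

In this toy case GENE_A is four-fold lower and GENE_B four-fold higher than control, so a panel containing both would include at least one down-regulated and at least one up-regulated gene.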

Exemplary gene expression profiles associated with non-CNS carcinoma (or
particular types of non-CNS carcinoma, such as breast, lung or colon carcinoma) are
shown in FIGS. 2-26. A reference gene expression profile can include at least a portion 
of the genes or gene products shown in these figures. For example, a reference gene 
expression profile associated with lung carcinoma can include a value for the differential
expression of 1, 2, 5, 10, 20, 30, 40, 50, or more, genes or gene products listed as CNS
markers for lung carcinoma in FIG. 24C. In another example, a reference gene 
expression profile associated generally with carcinoma can include a value for the 
differential expression of between one and seven genes or gene products listed as CNS 
markers for carcinoma in FIG. 26. The reference profiles that can be used with the
methods of the invention are not limited by the CNS markers described herein.

Reference profiles can be generated by detecting changes in patterns of gene 
expression in the CNS in response to the presence of non-CNS disease in an experimental 
animal, and identifying the human homologs of the genes and gene clusters that are 
differentially expressed in a certain pattern in the experimental samples, as exemplified in
Examples 1-3 described herein. A reference gene expression profile can also be obtained
by evaluating human CNS gene expression data. For example, a database can be created 
and maintained where CNS gene expression data is obtained and stored, e.g., digitally or 
electronically, for tens, hundreds, or thousands of individuals. The individuals can be 
followed and evaluated with regard to cancer clinical state longitudinally (e.g., at least 5
years, 10 years, 15 years, 20 years, 30 years, 50 years or a lifetime). The expression
profiles of individuals who developed a particular disease, e.g., 5 years, 10 years, 15
years, 20 years, 30 years, or 50 years after the CNS gene expression data was obtained,
can be compared with the expression profiles of individuals who remained disease free. 
Similar comparisons can be made between individuals who developed one clinical type of
the disorder compared to another, or individuals who developed the disease at an early
age versus a late age. These analyses can provide specific reference CNS gene
expression profiles that are associated with different stages of disease, e.g., different
stages of neoplasia, or different types of tumors. 

A "test gene expression profile" is obtained from a CNS sample of a subject to be 
tested for the presence of peripheral disease. First, a CNS sample, e.g., a brain cell
sample or CSF sample, is obtained from the subject by routine means such as brain
needle biopsy (for a brain cell sample) or a lumbar puncture (for CSF), as described 
herein. The sample is then prepared for use in a method of detecting gene expression, 
e.g., any method of detecting gene expression described herein. In one embodiment, total 
RNA can be prepared from the sample, and reverse transcribed into cDNA for use in a
nucleic acid array assay described herein. In another embodiment, total protein is
prepared from the sample for use in an antibody assay described herein. The prepared
sample can then be contacted with an array (e.g., an antibody or nucleic acid array) that 
can detect expression levels (or protein levels in the case of an antibody array) of at least 
one cluster or panel of CNS genes or gene products corresponding to the cluster or panel
of CNS genes or gene products of one or more particular reference gene expression
profiles to which the test sample will be compared. For example, a prepared CNS sample
from the test subject can be contacted with a nucleic acid array containing nucleic acid 
probes or an antibody array containing antibody probes for two or more, e.g., between 2 
and 50, between 2 and 100, or between 10 and 500, of the genes shown in FIGS. 2-26.
In one embodiment, the array can contain probes for each of the marker genes in a
particular cluster disclosed in any of FIGS. 2-26. 

The results of the array assay are obtained by routine techniques, such as 
fluorescence detection and measurement of bound antibody or hybridized nucleic acid for 
each position (each probe) on the array. A dataset of the values for the level of each
polypeptide or gene detected in the CNS sample by each antibody or probe on the array
can then be generated. The dataset can contain information such as patient identifier, and 
actual and/or relative levels of expression or protein detected. Such a dataset can be 
used directly as the "test gene expression profile" or the dataset can be converted into a 
format comparable to the format of the reference profile. 

Once the test expression profile is generated, it can be compared to a
reference expression profile as described herein.

Analyzing Gene Expression Profiles

The invention also features methods of evaluating a subject by comparing a test 
gene expression profile from a test subject with a reference gene expression profile, e.g., 
a negative control ("normal") gene expression profile associated with the absence of a
particular non-CNS disorder or a positive control gene expression profile associated with 
the presence of the disorder. Longitudinal studies of CNS gene expression in multiple 
volunteers can be performed to identify and confirm reference gene expression profiles 
that are associated with individuals who remain disease free or individuals who get the
disease. Such studies can provide the raw data for a database of negative and positive
control gene expression profiles that can be used in the present methods. 

Subject "test" and "reference" profiles can be obtained by methods described 
herein. In one embodiment, the method includes obtaining a CNS sample from a subject 
(either directly or indirectly from a caregiver or other party), creating an expression
profile from the sample, and comparing the subject's expression profile to one or more
reference profiles and/or selecting a reference profile most similar to that of the subject. 

As with other detection methods, profile-based assays can be performed prior to 
the onset of symptoms (in which case they can be diagnostic), prior to treatment (in 
which case they can be prognostic) or during the course of treatment (in which case they
serve as monitors) (see, e.g., Golub et al., 1999, Science 286:531).

A variety of routine statistical measures can be used to compare two gene 
expression profiles. One possible metric is the length of the distance vector that is the 
difference between the two profiles. Each of the test and reference profiles is represented
as a multi-dimensional vector, wherein each dimension is a value in the profile, e.g., a
value for the expression of a particular gene in a panel. A test profile and reference
profile can be said to match if they are at least 70% identical in reference to the number 
of genes having similar expression patterns in each profile, or to the level of expression 
of the genes in each profile. In one embodiment, a test and reference profile are said to 
match if their respective multi-dimensional vectors, as described above, have a 30% or
lower variance with respect to each other. If the test and reference profile match, the test
subject can be identified as having the peripheral disorder with which the reference
profile is associated. If the test and normal profile match, the subject is likely to be free
of the peripheral disorder. 
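
The two measures described above, the length of the difference vector and the fraction of genes with similar expression patterns, can be sketched as follows. The example is illustrative only: the cutoff used to call a gene up-regulated, down-regulated, or unchanged (a log ratio of 1.0) and the example values are assumptions, and only the 70% matching criterion is taken from the description.

```python
import math


def distance(test, reference):
    """Length (Euclidean norm) of the difference vector between two profiles."""
    return math.sqrt(sum((t - r) ** 2 for t, r in zip(test, reference)))


def fraction_concordant(test, reference, cutoff=1.0):
    """Fraction of genes given the same call (up, down, unchanged) in both profiles."""
    def call(value):
        if value >= cutoff:
            return "up"
        if value <= -cutoff:
            return "down"
        return "unchanged"

    agree = sum(call(t) == call(r) for t, r in zip(test, reference))
    return agree / len(reference)


def profiles_match(test, reference, min_fraction=0.70):
    """Apply the 70% criterion to the fraction of concordant genes."""
    return fraction_concordant(test, reference) >= min_fraction


# Hypothetical log-ratio values for a five-gene panel, in the same gene order.
reference_profile = [-2.0, -1.5, 1.8, 2.2, -1.2]
test_profile = [-1.7, -1.9, 2.0, 0.3, -1.4]
matched = profiles_match(test_profile, reference_profile)
```

Here four of the five genes receive the same call in both profiles, so the profiles are treated as matching under the 70% criterion even though the fourth gene differs.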

In one embodiment, pattern recognition software is used to identify matching 
profiles. For example, unsupervised clustering algorithms, such as hierarchical
clustering, K-means clustering, and SOM (self-organizing maps) for pattern discovery
can be used. Supervised techniques such as SVM (support vector machines) and 
SPLASH (structural pattern localization analysis by sequential histograms) algorithms 
implemented in the Genes@Work software package (IBM Corp.) can also be used. 
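
As one illustration of unsupervised clustering, a bare-bones agglomerative (hierarchical) procedure with average linkage can be sketched as follows. It is not the Genes@Work implementation or any other named package; the function names and sample values are hypothetical, and a production analysis would more likely rely on an established statistics library.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(profiles[i], profiles[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def hierarchical_clusters(profiles, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], profiles)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical log-ratio profiles (three genes) for six samples.
samples = [
    [0.1, -0.2, 0.0], [0.0, -0.1, 0.2], [0.2, 0.0, -0.1],     # near control
    [-1.9, 2.1, -2.0], [-2.2, 1.8, -1.7], [-2.0, 2.0, -2.1],  # shifted pattern
]
groups = hierarchical_clusters(samples, n_clusters=2)
```

The returned lists hold sample indices, so the first three samples and the last three samples end up in separate groups.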

In another embodiment, gene expression profiles are analyzed by quantitative
pattern comparison performed by applying a nearest neighbor classifier (see Jelinek et al.,
Mol. Cancer Res., 1:346-61, 2003). Based on the nearest neighbor classifier, a score is
defined which, together with a permutations-derived distribution, can be used to estimate
the probability that each test profile belongs to a class defined by a reference gene
expression pattern (see Jelinek, supra). 
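
A generic nearest neighbor comparison with a permutation-derived distribution can be sketched as follows. This is not the procedure of Jelinek et al., whose scoring details are not reproduced here; the distance-based score, the shuffling of gene positions, the class labels, and all values are assumptions made for illustration.

```python
import math
import random


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_reference(test, references):
    """Return (label, distance) of the reference profile closest to the test profile."""
    label = min(references, key=lambda name: dist(test, references[name]))
    return label, dist(test, references[label])


def permutation_p_value(test, reference, n_perm=1000, seed=0):
    """Estimate how often a gene-shuffled test profile is at least as close."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    observed = dist(test, reference)
    shuffled = list(test)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if dist(shuffled, reference) <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical reference classes and test profile over a six-gene panel.
references = {
    "normal": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    "lung_carcinoma": [-2.0, -1.5, 1.8, 2.2, -1.2, 1.0],
}
test_profile = [-1.8, -1.6, 2.0, 1.9, -1.0, 1.2]
label, score = nearest_reference(test_profile, references)
p_value = permutation_p_value(test_profile, references[label])
```

A small p_value indicates that the closeness of the test profile to the selected reference is unlikely to arise when the gene assignments are scrambled.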

The result of the diagnostic test, which can be transmitted to the subject, a
caregiver, or another interested party, can be the subject expression profile per se, a result
of a comparison of the subject expression profile with another profile, a most similar 
reference profile, or a descriptor of any of these. Transmission can occur across a 
computer network (e.g., in the form of a computer transmission such as a computer data
signal embedded in a carrier wave). Accordingly, the invention also features a computer
medium having executable code for effecting the following steps: receive a subject 
expression profile; access a database of reference expression profiles; and either i) select 
a matching reference profile most similar to the subject expression profile, or ii) 
determine at least one comparison score for the similarity of the subject expression
profile to at least one reference profile. The subject expression profile and the reference
expression profile each include a value representing the level of expression of one or 
more of the identified genes or gene products or the proteins they encode. 
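
The recited steps (receive a subject profile, access a database of reference profiles, and select the most similar profile or determine a comparison score) can be sketched as follows. The in-memory dictionary standing in for the database, the use of the Pearson correlation as the comparison score, and the gene and class names are all assumptions of this sketch rather than features stated in the description.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length value lists (non-constant inputs)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def compare_to_references(subject, database):
    """Score the subject against every reference; return (best label, all scores)."""
    scores = {}
    for label, reference in database.items():
        shared = sorted(g for g in subject if g in reference)  # genes in both
        scores[label] = pearson(
            [subject[g] for g in shared], [reference[g] for g in shared]
        )
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical reference database and subject profile (log-ratio values).
database = {
    "carcinoma": {"GENE_A": -2.0, "GENE_B": 1.5, "GENE_C": -1.0, "GENE_D": 2.0},
    "normal": {"GENE_A": 0.1, "GENE_B": -0.1, "GENE_C": 0.2, "GENE_D": -0.2},
}
subject = {"GENE_A": -1.8, "GENE_B": 1.2, "GENE_C": -0.7, "GENE_D": 2.3}
best_match, scores = compare_to_references(subject, database)
```

The function returns both outputs contemplated above: the label of the most similar reference profile and a comparison score for each reference.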

Predictive Medicine 

The methods described herein are generally useful in the field of predictive
medicine and, more specifically, are useful in diagnostic and prognostic assays, in
monitoring progression of a disease, e.g., neoplasia, or monitoring of response to
treatment, e.g., in clinical trials. For example, one can determine whether a subject has a 
very early stage neoplasia, in the absence of other, e.g., clinical, indications of neoplasia. 
In another example, one can determine whether a subject is at risk for developing 
rheumatoid arthritis or whether the subject has early stage RA, in the absence of clinical
indications of RA such as joint inflammation. The methods are particularly useful, e.g., 
for patients who have had surgery or treatment for the disease (e.g., to remove cancer), in 
which case the methods could be used to monitor recurrence or metastasis, for persons 
living in regions of high incidence of cancer due, e.g., to environmental factors, or for
individuals who have a family history of a disease (e.g., diabetes, asthma or cancer) or
are carriers of a disease susceptibility gene, e.g., a cancer susceptibility gene (e.g., 
BRCA1 or BRCA2, hMSH2, MLH1, MSH2, or MSH6). Other cancer susceptibility 
genes are described in The Genetic Basis of Human Cancer, 2nd edition (Vogelstein and
Kinzler, Eds.), McGraw-Hill Professional (2002). Such individuals can be evaluated
using the methods described herein.

In some cases, for example, where the risk of developing a disease is high (e.g., 
where an individual has a strong family history of asthma, or where an individual carries 
a cancer susceptibility gene or lives in a high risk area for cancer), an individual can be
evaluated periodically (e.g., every 10 years, every 5 years, or every year) during his
or her lifetime.

The "subject" referred to here, and that is referred to in the context of any of the 
methods of the invention, is a vertebrate animal, typically a mammal. The subject can be 
an experimental animal (e.g., an experimental rodent such as a rat or mouse), a
domesticated animal (e.g., a dog or cat), an animal kept as livestock (e.g., a pig, cow,
sheep, goat, or horse), or a non-human primate (e.g., an ape, monkey, or chimpanzee). The
animal can be an unborn animal (accordingly, the methods of the invention can be used to 
carry out genetic screening or to make prenatal diagnoses). Of course, the subject can 
also be, and typically is, a human.

Computer-Readable Medium

In another aspect, the invention features a computer-readable medium having a 
plurality of digitally encoded data records. Each data record includes a value 
representing the level of expression of a CNS gene, and a descriptor of the sample. The 
descriptor can be, e.g., an identifier (e.g., an identifier for the patient from whom the
sample was obtained, e.g., a name or a reference code that can be matched with patient 
information only by those having access to a decoding table), a diagnosis made, or a 
treatment to be performed in the event the level of expression reaches a certain level or 
falls below a certain level. The data record can also include values representing the level
of expression of related genes (e.g., the data record can include values for each of a
plurality of genes in a gene "cluster," where a particular reference profile of gene 
expression for the genes in the cluster is associated with a peripheral disorder). The data 
record can also include values for control genes (e.g., genes whose expression is not 
changed in control samples or whose expression is not diagnostically correlated with a
peripheral disorder). The data record can be structured as a table (e.g., a table that is part
of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase 
database environments)). 
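
One possible tabular layout for such data records is sketched below. SQLite, accessed through the Python standard library, is used only as a convenient stand-in for the Oracle or Sybase environments mentioned above; the table name, column names, sample code, and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    """
    CREATE TABLE expression_record (
        sample_code      TEXT NOT NULL,               -- coded sample descriptor
        gene             TEXT NOT NULL,
        expression_level REAL NOT NULL,
        is_control_gene  INTEGER NOT NULL DEFAULT 0,  -- 1 for control genes
        diagnosis        TEXT,                        -- optional descriptor
        PRIMARY KEY (sample_code, gene)
    )
    """
)
records = [
    ("S-0001", "GENE_A", -1.8, 0, None),
    ("S-0001", "GENE_B", 1.2, 0, None),
    ("S-0001", "CONTROL_1", 0.0, 1, None),
]
conn.executemany("INSERT INTO expression_record VALUES (?, ?, ?, ?, ?)", records)

# Retrieve the non-control values for one coded sample, ordered by gene.
cluster_values = conn.execute(
    "SELECT gene, expression_level FROM expression_record "
    "WHERE sample_code = ? AND is_control_gene = 0 ORDER BY gene",
    ("S-0001",),
).fetchall()
```

Storing only a coded sample identifier in the table is consistent with the use of a reference code that can be matched with patient information only through a separate decoding table.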

Non-CNS Diseases 

The methods described herein are not limiting in that they can be used to
diagnose, monitor, or treat any non-CNS disorder, such as a neoplasia (e.g., tumor or
cancer); an immune disorder (e.g., an autoimmune disorder such as rheumatoid arthritis, 
multiple sclerosis, systemic lupus erythematosus, psoriasis, scleroderma); an allergic or 
inflammatory disorder (e.g., asthma, inflammatory bowel disease, Crohn's disease); a
metabolic or endocrine disorder (e.g., diabetes, obesity, Addison's disease); a pathogenic
infection (e.g., a viral, parasitic or fungal infection, e.g., HIV infection); or a 
cardiovascular disorder. 

As used herein, "neoplasia" refers to the uncontrolled and progressive 
proliferation of cells under conditions that would not elicit, or would cause cessation of,
proliferation of normal cells. Neoplasia results in the formation of a "neoplasm," which
is defined herein to mean any new and abnormal growth, particularly a new growth of
tissue, in which the growth is uncontrolled and progressive. Neoplasm, as used herein, is
synonymous with "tumor." Malignant neoplasms or tumors are distinguished from 
benign in that the former show a greater degree of anaplasia, or loss of differentiation and 
orientation of cells, and have the properties of invasion and metastasis. Thus, neoplasia 
includes "cancer," which herein refers to a proliferation of cells having the unique trait of
loss of normal controls, resulting in unregulated growth, lack of differentiation, local 
tissue invasion, and metastasis. The methods described herein can be used to diagnose 
neoplasia from any non-CNS cell or tissue type, such as neoplasia derived from epithelial 
or endocrine tissue, mesenchymal tissues, or hematopoietic tissue. 

The term "carcinoma" is art recognized and refers to malignancies of epithelial or
endocrine tissues including respiratory system carcinomas, gastrointestinal system
carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, 
prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary 
carcinomas include those forming from tissue of the colon, lung, prostate, breast, cervix,
head and neck, and ovary. The term also includes carcinosarcomas, which include
malignant tumors composed of carcinomatous and sarcomatous tissues. An 
"adenocarcinoma" refers to a carcinoma derived from glandular tissue or in which the 
tumor cells form recognizable glandular structures. 

The term "sarcoma" is art recognized and refers to malignant tumors of
mesenchymal derivation.

As used herein, the term "hematopoietic neoplastic disorders" includes diseases 
involving hyperplastic/neoplastic cells of hematopoietic origin, e.g., arising from 
myeloid, lymphoid or erythroid lineages, or precursor cells thereof. The disorders can 
arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute
megakaryoblastic leukemia. Exemplary myeloid disorders include, but are not limited to,
acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic 
myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit. Rev. in
Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute
lymphoblastic leukemia (ALL), which includes B-lineage ALL and T-lineage ALL,
chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell
leukemia (HLL) and Waldenstrom's macroglobulinemia (WM). Additional forms of
malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and
variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), 
cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGF), 
Hodgkin's disease and Reed-Stemberg disease. 

Identification of CNS Marker Genes for Non-CNS Disorders

Also featured in the invention are methods of identifying a CNS diagnostic
marker for a non-CNS disorder in a subject. Generally, such methods involve detecting
changes in gene expression in the CNS in response to the presence of a particular
non-CNS disease condition in a subject, e.g., an experimental animal. The methods will
generally involve inducing a disease condition or disorder in a test experimental animal,
and comparing the expression of at least one gene in a CNS sample from the test
experimental animal to expression of the gene in a CNS sample from a control
experimental animal. A gene (or a human homolog of a gene) that is differentially
expressed in the CNS sample from the test experimental animal compared to the CNS
sample from the control experimental animal can be identified as a CNS diagnostic
marker for a non-CNS disorder. Such markers are referred to herein as CNS "marker
genes" or "disease surveillance genes" for non-CNS disease. It is understood, however,
that the gene product of the marker gene can also serve as a diagnostic marker. In most
cases, a plurality of differentially expressed markers is identified (e.g., a "profile" or
"cluster" of markers is identified). The experimental animal is preferably an
experimental mammal, and can be, e.g., an experimental rodent (e.g., a rat, mouse or
guinea pig) or non-human primate (e.g., a monkey, or an ape such as a chimpanzee).

The methods of detection of gene expression described herein, and particularly
array and chip technology, are useful for methods of identifying CNS marker genes for
non-CNS neoplasia. CNS samples are prepared from experimental and control animals
(e.g., brains are biopsied or removed, or CSF samples are taken) and RNA, cDNA or
protein is prepared from the samples as described herein. A single chip (e.g., a
commercially available chip having probes for a large number of genes in the genome of
the experimental animal species) can allow measurement of the level at which hundreds,
thousands, or even tens of thousands of genes are expressed in the CNS sample of a test
experimental animal compared to a control experimental animal. Typically, clustering
methodology or other bioinformatics tools are used to mine the data obtained from such
large scale experiments and identify the genes or clusters of genes that are statistically
significantly differentially expressed in an experimental sample compared to a control
sample. Many such tools and programs are available to the skilled artisan. An
exemplary method of data analysis is described herein and exemplified in the Examples
below.
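
By way of illustration only, the comparison of test and control CNS samples can be sketched in a few lines of Python. This is a minimal sketch and not the data analysis method of the Examples: the gene names, the intensity values, and the two cutoffs (an absolute log2 fold change of 1.0 and an absolute Welch t statistic of 4.0) are hypothetical assumptions chosen for demonstration. A real array experiment would also require normalization and correction for multiple testing, which are omitted here.

```python
import math
from statistics import mean, variance


def log2_fold_change(test, control):
    """log2 ratio of mean test expression to mean control expression."""
    return math.log2(mean(test) / mean(control))


def welch_t(test, control):
    """Welch's t statistic for two independent samples (unequal variances)."""
    se = math.sqrt(variance(test) / len(test) + variance(control) / len(control))
    return (mean(test) - mean(control)) / se


def differentially_expressed(expression, min_abs_log2fc=1.0, min_abs_t=4.0):
    """Return {gene: (log2fc, t)} for genes passing both illustrative cutoffs."""
    hits = {}
    for gene, (test, control) in expression.items():
        lfc = log2_fold_change(test, control)
        t = welch_t(test, control)
        if abs(lfc) >= min_abs_log2fc and abs(t) >= min_abs_t:
            hits[gene] = (lfc, t)
    return hits


# Hypothetical normalized intensities: (test animals, control animals).
example = {
    "geneA": ([820.0, 790.0, 845.0], [400.0, 410.0, 395.0]),  # higher in test
    "geneB": ([505.0, 498.0, 510.0], [500.0, 495.0, 507.0]),  # unchanged
    "geneC": ([100.0, 110.0, 95.0], [420.0, 400.0, 415.0]),   # lower in test
}
hits = differentially_expressed(example)
for gene, (lfc, t) in sorted(hits.items()):
    print(f"{gene}: log2FC={lfc:.2f}, t={t:.1f}")
```

With these made-up values, geneA and geneC pass both cutoffs and geneB does not. Requiring both an effect size and a test statistic is one common way to avoid reporting genes with large but noisy changes.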

CNS Marker Genes for Neoplasia 

In one embodiment, identifying a CNS diagnostic marker for a non-CNS
neoplastic disorder involves detecting changes in gene expression in the CNS in response
to the presence of a non-CNS neoplasm in an experimental animal. For example, a
neoplasm is induced in an experimental animal and gene expression in the CNS of the
experimental animal is evaluated compared to a control animal. Methods for inducing
growth of a non-CNS neoplasm, e.g., a cancer, in an experimental animal, are known in
the art and include, e.g., chemical or radiation mutagenesis, or transplantation of a
neoplastic cell (e.g., a neoplastic cultured cell or cell line) to the experimental animal.
CNS genes or gene products whose expression is altered in the experimental animal
compared to a control animal are identified as CNS markers or surveillance genes for
neoplasia. Examples of CNS marker genes for cancer, particularly for carcinoma, are
provided herein by FIGS. 2-26 and Examples 1-3.

CNS Marker Genes for Rheumatoid Arthritis

In another embodiment, identifying a CNS diagnostic marker for rheumatoid
arthritis (RA) involves detecting changes in gene expression in the CNS in an animal
model of RA compared to a wild type animal. For example, the art-recognized rodent
collagen induced arthritis (CIA) model can be used. In this model, arthritis is induced in
a rodent, e.g., a DBA/1 mouse, by intradermal injection of purified collagen. Typically,
100 µg of purified type II collagen emulsified in complete adjuvant is injected at the base
of the tail. Onset of arthritis is macroscopically visible as paw swelling or redness
approximately three weeks after immunization (Williams et al., 1992, Proc. Natl. Acad.
Sci. (USA), 89:9784-9788). Clinical features of arthritis are monitored by quantitatively
assessing paw swelling (e.g., with calipers) over a period of time. Severity of arthritis is
assessed according to established clinical scores (Williams et al., 1995, Eur. J.
Immunol., 25:763-769). CNS genes or gene products whose expression is altered in the
CIA animal compared to a control animal are identified as CNS markers or surveillance
genes for RA.

Given the involvement of Th1 lymphocytes and B cells, proinflammatory
cytokines, and a possible mimicry of bacterial LPS in disease development, it is likely that
genes that regulate these processes are candidates to be involved in early RA surveillance
in the CNS. For example, proinflammatory cytokines produced in the brain such as
IL-1β, TNF, IL-18, IFN-γ, IL-12, gp130; cytokines such as IL-6 and leukemia inhibitory
factor (LIF); neurotransmitters and neurotrophic factors such as N-methyl-D-aspartate
(NMDA), brain-derived neurotrophic factor (BDNF), glial cell line-derived neurotrophic
factor (GDNF), nerve growth factor (NGF); inhibitors of cytokines such as prostaglandin
E2 (PGE2) and SOCS-1 and -3; SOCS regulators such as cAMP-inducing central
peptides; brain molecules that are produced as a result of cytokine action, such as
pentraxin 3 (PTX3); hormone releasing factors such as corticotropin;
corticotropin-releasing hormone (CRH) and other hormones involved in the regulation of the HPA
axis; pituitary corticotroph proteins such as POMC; molecules involved in
NF-κB-mediated signaling of inflammatory response; and other members of the families of these
genes, as well as inducers and stimulators of these proteins, may be disease-surveillance
genes for RA. See, e.g., Blond et al., 2002, Brain Res., 958(1):89-99; Suk et
al., 2001, Immunol. Lett., 77(2):79-85; Losy et al., 2001, Acta Neurol. Scand.,
104(3):171-3; Opp et al., 2001, Neuroendocrinology, 73(4):272-84; Chesnokova et al.,
2002, Endocrinology, 143(5):1571-4; Bousquet et al., 2002, Mol. Endocrinol.,
15(11):1880-90; Polentarutti et al., 2000, J. Neuroimmunol., 106(1-2):87-94; Bayas et al.,
2003, Neurosci. Lett. 335(3):155-8; Xu et al., 2000, Acta Pharmacol. Sin. 21(7):600-4;
Fang et al., 2000, Neuroreport, 11(4):737-41.

CNS Marker Genes for Asthma

In another embodiment, identifying a CNS diagnostic marker for asthma involves
detecting changes in gene expression in the CNS in an animal model of asthma compared
to a wild type animal. Several experimental models of asthma are known in the art,
including rodent, sheep, and non-human primate models (for a review, see Isenberg-Feig
et al., 2003, Curr. Allergy Asthma Rep. 3(1):70-8). Any of these can be used in the
present methods. In one embodiment, the experimental model of asthma is performed
according to Komai et al. (2003, Br. J. Pharmacol., 138(5):912-20). In brief, BALB/c
mice are sensitized by intraperitoneal administration of 50 µg of ovalbumin combined
with 1 mg of alum (Al(OH)3) on days 0 and 12. From day 22 to day 43, animals are exposed
to daily aerosol challenges of 1% w/v of ovalbumin for 30 minutes. Control animals can
include saline-injected animals and animals sensitized with ovalbumin and alum and
challenged with saline. Airway function is evaluated by measuring one or more of:
airway responsiveness to acetylcholine; IL-4, IL-5, and/or IL-13 levels; interferon-γ
levels; eosinophil numbers in bronchoalveolar fluids; specific IgG1 and IgG2a levels in
sera; lung histology; and rectal temperature. CNS markers or surveillance genes for
asthma are those whose expression is altered in the asthma model animal compared to a
control animal, or those whose expression is altered after aerosol challenge compared to
before aerosol challenge.
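
For planning purposes, the sensitization and challenge timeline summarized above can be written out as a simple day-by-day schedule. The following is a minimal sketch: the day numbers and amounts are taken from the text, while the function name and the wording of each schedule entry are illustrative assumptions and are not part of the cited protocol.

```python
# Day numbers come from the protocol summarized in the text; names are illustrative.
SENSITIZATION_DAYS = (0, 12)
CHALLENGE_DAYS = range(22, 44)  # days 22 to 43 inclusive


def build_schedule():
    """Map each study day to the procedure performed on that day."""
    schedule = {
        day: "sensitize: 50 ug ovalbumin + 1 mg alum, intraperitoneal"
        for day in SENSITIZATION_DAYS
    }
    for day in CHALLENGE_DAYS:
        schedule[day] = "challenge: 1% w/v ovalbumin aerosol, 30 min"
    return schedule


schedule = build_schedule()
n_challenges = sum(1 for step in schedule.values() if step.startswith("challenge"))
for day in sorted(schedule):
    print(f"day {day:2d}: {schedule[day]}")
print(f"total aerosol challenges: {n_challenges}")
```

Counting days 22 through 43 inclusive gives 22 daily challenges, which is a convenient check when CNS samples are to be collected before and after the challenge period.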

Several gene products associated with the CNS have been shown to influence the
Th-2 response and are candidates as disease-surveillance genes. These include
glucocorticoid, one of the main hormonal mediators of stress, which acts on
antigen-presenting cells to suppress the production of IL-12 in vitro and ex vivo;
neurotransmitters norepinephrine or epinephrine; β-adrenoreceptor (AR) agonists and
antagonists (e.g., propranolol); modulators of neurotransmission such as adenosine and
adenosine analogues; opioid system components, which influence the immunological
response in general and the Th-1/Th-2 balance in particular; mediators of allergic
reactions, such as histamine; and neuropeptides such as substance P, vasoactive intestinal
peptide and somatostatin, which increase the release of histamine from mast cells. See
Blotta et al., 1997, J. Immunol. 158:5589-5595; Elenkov et al., 1996, Proc. Assoc. Am.
Physicians, 108:374-381; Cooper et al., The Biochemical Basis of Neuropharmacology,
Oxford University Press, 1996, p. 123; Link et al., 1999, J. Immunol. 164:436-442;
Loizzo et al., 2002, Br. J. Pharmacol., 135(5):1219-26; Lowman et al., 1988, British
Journal of Pharmacology, Vol 95:121-130; and Elenkov et al., Annals of the New York
Academy of Sciences, 2000, 917:94-105.

CNS Marker Genes for Diabetes

In another embodiment, identifying a CNS diagnostic marker for diabetes
involves detecting changes in gene expression in the CNS in an animal model of diabetes
compared to a wild type animal. Several experimental models of diabetes are known in
the art, e.g., spontaneous models such as the NOD mouse and BB rat, and inducible
models such as streptozotocin (STZ)-induced diabetic rats. These are reviewed in
Cheta, 1998, J. Pediatr. Endocrinol. Metab., 11(1):11-9. CNS markers or surveillance
genes for diabetes are those whose expression is identified to be altered in an induced
animal compared to an uninduced animal (e.g., a streptozotocin-fed STZ rat compared to
a control-fed STZ rat), or those whose expression is altered in the early stages of
spontaneous progression of disease.

CNS Marker Genes for Obesity 

In yet another embodiment, identifying a CNS diagnostic marker for propensity
for obesity involves detecting changes in gene expression in the CNS in an animal model
of obesity, e.g., comparing CNS gene expression in an obesity-prone animal before and
after obesity develops or is clinically detectable. The method can involve comparing
differences in CNS gene expression between mouse strains that are either prone to
obesity or resistant to obesity after being exposed to a fat-rich diet. For example, the
method can employ the C57BL/KsJ (KsJ) or A/J strain of mice, both of which are
resistant to the development of dietary obesity, or the obesity-prone strain C57BL/6J
(B6).

Possible disease-surveillance genes for obesity or loss of body weight control
include leptin, leptin receptor, ghrelin, cholecystokinin (CCK), CCK-A receptor,
neuropeptide Y (NPY), proopiomelanocortin (POMC), α-melanocyte stimulating
hormone (α-MSH), and other molecules that participate in the central control of energy
balance. Given the fact that so many gene products orchestrate behaviors related to food
intake, genetic deficiencies or the presence of particular polymorphic alleles in one or
more of these genes may induce disorders in the control of energy homeostasis leading to
obesity. Such a deficiency or disruption in the normal signaling of such molecules can
likely trigger an early signal that alters CNS gene expression.

Isolating Homologous Sequences from Other Species 

The human homologs of CNS marker genes and their products (e.g., human 
homologs of CNS marker genes identified by experiments in non-human experimental 
animals) are useful for various embodiments of the methods described herein. Human 
homologs are known for most of the CNS marker genes provided herein. In those cases 
where a human homolog is not identified, several approaches can be used to identify such 
genes. These methods include low stringency hybridization screens of human libraries 
with a mouse marker gene nucleic acid sequence, polymerase chain reactions (PCR) of 
human DNA sequence primed with degenerate oligonucleotides derived from a mouse 
marker gene, two-hybrid screens, and database screens for homologous sequences. 

Therapeutic Methods 

The methods described herein can identify or diagnose the presence of a non-CNS 
disorder in a subject at an early stage in the pathogenic process. As such, the methods 
allow for early intervention, which can be the key to successful treatment and/or 
management of many disorders. For example, if a propensity for obesity or diabetes can 
be diagnosed at an early stage using the methods described herein, simple lifestyle or 
nutritional changes may be sufficient to stop or slow the progress of the disease, where 
such changes would not be sufficient if the disease were diagnosed at a later, more
advanced stage. Similarly, a neoplasia that is detected at an early stage is more likely
to be treatable with less toxic therapeutic agents, or lower doses of a therapeutic agent, than
would be used at a stage of advanced neoplasia, e.g., cancer. 

Chemotherapeutic Agents 

In one embodiment, the methods described herein can identify or diagnose the 
presence of a non-CNS neoplasia in a subject at an early stage, e.g., before a neoplasm 
has formed, before a neoplasm is clinically detectable, and/or before a tumor has become
malignant. As such, a neoplasm detected by a method described herein is amenable to
treatment by an agent that targets neoplastic cells in general or targets specific neoplastic 
cells in particular. In one embodiment, a subject may be treated with a chemotherapeutic 
agent. Chemotherapeutic agents, as used herein, refer to chemical therapeutic agents or 
drugs used in the treatment of neoplasia. This term is used for simplicity notwithstanding
the fact that other compounds may be technically described as chemotherapeutic agents 
in that they exert an anti-cancer effect. A number of exemplary chemotherapeutic agents 
are described below. 

Suitable chemotherapeutic agents include: antitubulin/antimicrotubule drugs,
e.g., paclitaxel, taxol, tamoxifen, vincristine, vinblastine, vindesine, vinorelbine, taxotere;
topoisomerase I inhibitors, e.g., topotecan, camptothecin, doxorubicin, etoposide,
mitoxantrone, daunorubicin, idarubicin, teniposide, amsacrine, epirubicin, merbarone,
piroxantrone hydrochloride; antimetabolites, e.g., 5-fluorouracil (5-FU), methotrexate,
6-mercaptopurine, 6-thioguanine, fludarabine phosphate, cytarabine/Ara-C, trimetrexate,
gemcitabine, acivicin, alanosine, pyrazofurin, N-Phosphonoacetyl-L-Aspartate=PALA,
pentostatin, 5-azacitidine, 5-Aza 2'-deoxycytidine, ara-A, cladribine, 5-fluorouridine,
FUDR, tiazofurin, N-[5-[N-(3,4-dihydro-2-methyl-4-oxoquinazolin-6-ylmethyl)-N-methylamino]-2-thenoyl]-L-glutamic acid;
alkylating agents, e.g., cisplatin, carboplatin,
mitomycin C, BCNU=Carmustine, melphalan, thiotepa, busulfan, chlorambucil,
plicamycin, dacarbazine, ifosfamide phosphate, cyclophosphamide, nitrogen mustard,
uracil mustard, and pipobroman, 4-ipomeanol; estrogen modulators, e.g., raloxifene;
piroxicam; 9-cis retinoic acid.

Suitable dosages for the selected chemotherapeutic agent are known to those of
skill in the art. For example, where the agent is doxorubicin, a suitable dosage may include
30 mg/m² of patient skin surface area, administered intravenously, twice at 1 week
intervals. However, one of skill in the art can readily adjust the route of administration,
the number of doses received, the timing of the doses, and the dosage amount, as needed.
Bearing in mind these considerations, generally, a suitable dose for a given
chemotherapeutic agent is between 10 mg/m² and about 500 mg/m², and more preferably,
between 50 mg/m² and about 250 mg/m² of patient skin surface area (the skin surface of an
average sized adult human is about 1.8 m²). Such a dose, which may be readily adjusted
depending upon the particular drug or agent selected, may be administered by any
suitable route, including, e.g., intravenously, intradermally, by direct site injection,
intraperitoneally, intranasally, or the like. Doses may be repeated as needed.
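
The arithmetic behind the surface-area-based figures above can be illustrated with a short calculation. This is a minimal sketch of the unit conversion only: it multiplies a dose expressed per square meter by the approximate 1.8 m² surface area of an average sized adult quoted in the text. The function and variable names are hypothetical, and the sketch does not model any of the adjustments that one of skill in the art would make.

```python
AVERAGE_ADULT_SURFACE_M2 = 1.8  # approximate figure quoted in the text


def total_dose_mg(dose_mg_per_m2, surface_area_m2=AVERAGE_ADULT_SURFACE_M2):
    """Convert a dose in mg per square meter into a total dose in mg."""
    if dose_mg_per_m2 <= 0 or surface_area_m2 <= 0:
        raise ValueError("dose and surface area must be positive")
    return dose_mg_per_m2 * surface_area_m2


# Figures from the text, applied to the average adult surface area.
doxorubicin_example_mg = total_dose_mg(30)                  # about 54 mg per administration
general_range_mg = (total_dose_mg(10), total_dose_mg(500))  # about 18 mg to about 900 mg
preferred_range_mg = (total_dose_mg(50), total_dose_mg(250))  # about 90 mg to about 450 mg

print(f"30 mg/m2 -> {doxorubicin_example_mg:.0f} mg")
print(f"10-500 mg/m2 -> {general_range_mg[0]:.0f}-{general_range_mg[1]:.0f} mg")
print(f"50-250 mg/m2 -> {preferred_range_mg[0]:.0f}-{preferred_range_mg[1]:.0f} mg")
```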

In one embodiment, because a method described herein can identify or diagnose 
the presence of a non-CNS neoplasia in a subject at an early stage, e.g., before a
neoplasm has formed, before a neoplasm is clinically detectable, and/or before a tumor 
has become malignant, the dose of a chemotherapeutic agent may be lower than that 
typically used after a neoplasm, e.g., a cancer, is detected or diagnosed by clinical 
methods, such as visualization or palpation of a tumor mass. 

Therapeutic Targets 

A CNS marker gene for a non-CNS disorder, e.g., a CNS marker gene described
herein, may not only "sense" the presence of the disorder, but also actively participate in
responding to the presence of the disorder by generating a response, e.g., an antitumor
response. Alternatively, a CNS marker gene may respond to the presence of a non-CNS
disorder by promoting progression of the disorder, e.g., inducing growth of a neoplasm or
promoting malignant transformation of a neoplasm. As a therapeutic strategy, one would
want to promote the expression or activity of the former type of gene, and/or inhibit the
expression or activity of the latter type of gene, in the CNS. Thus, regardless of whether
a CNS marker gene generates a response to curb or promote a specific disorder, its
identification can provide a target for inhibiting progression of the disorder.

One way to identify such CNS marker genes that are also potential therapeutic
targets is to identify CNS genes that are differentially expressed in animals that exhibit an
inhibitory response against a disease compared to animals that do not exhibit an
inhibitory response. For example, experimental animals can be injected with tumor
inducing cells (e.g., colon cancer cells such as CT26) that express an interleukin (IL),
e.g., IL-12. Injection of tumor cells genetically modified to express IL-12 is known to
induce Th1 immune mediated tumor rejection (Adris et al., 2000, Cancer Res.,
60(23):6696-703). Control mice can be injected with tumor cells that do not express
IL-12. At different times after injection, gene expression in the CNS is analyzed in the
animals, as described herein, e.g., by microarray analysis. Thus, genes that "turn off" and
"turn on" specifically in the CNS (e.g., brain) of the animals can be identified. Some of
these genes will respond to the presence of the IL. Others will correspond to genes
actively engaged in the "stimulation" of the antitumor immune response. This strategy
can be used for any interleukin gene that may be involved in the stimulation of an
antitumor immune response. Identification of brain genes actively involved in
"stimulating" an antitumor response will provide a target for therapeutic intervention,
e.g., by direct use of the gene or its gene product, or by screening for agents that block or
stimulate their activity.

A second strategy for identifying CNS genes that are potential therapeutic targets
is by using transgenic animals (e.g., knockout mice) having brain specific disruptions
(e.g., knockouts) in specific genes. A great number of CNS-specific knockout mice are
currently available to the skilled artisan (see, e.g., the Jackson Laboratory web site,
describing numerous JAX® mouse models used in neurobiology), and many more can be
expected to become routinely available. A role in the CNS response to non-CNS disease
can be established for any particular gene for which a brain knockout animal can be
obtained or produced, by inducing the disorder in the knockout mice (e.g., as described
herein for cancer, RA, asthma or obesity), and evaluating disease outcome.

CNS marker genes and gene products that are also potential therapeutic targets are
listed in FIGS. 25A-E. These genes are or encode molecules involved in cell signaling
(e.g., growth factors, hormones, cytokines and their receptors) and are also differentially
expressed markers in each of the tumors studied.

Vaccines 

The methods described herein also provide targets for preventive vaccination. A
set of brain genes that "senses" a disease may include receptors for known or unknown
ligands. A disease cell might produce these ligands to inhibit the induction of a
brain-derived anti-disease response. In such an instance, identifying a CNS gene that is
involved in an anti-disease response can lead to the identification of a gene product
secreted by the diseased cell that might act on the brain to inhibit the anti-disease response. A
genetic vaccine targeting these products could be a viable therapeutic strategy.

One approach to identify CNS targets for preventive vaccination in the treatment
of non-CNS disorders is the following: obtain a CNS gene expression profile (using
techniques such as those described herein above) from animals that exhibit an
anti-disease response, e.g., in the case of a tumor, an IL-12 mediated antitumor response in an
experimental tumor model. It is expected that from the cluster of genes "sensing" the
tumor, some will change their expression levels in the presence of IL-12. This subset of
genes will likely be those involved in "generating" the antitumor response. This subset
of genes is likely to have predictable modulators. For example, if a CNS gene that
changes its expression profile in response to a non-CNS gene in the presence of IL-12 is a
receptor, one could predict that the change in gene expression of such a receptor could be
brought about by its ligand. Thus, a preventive genetic vaccine could be designed to
generate a memory response to such a ligand.

A second experimental approach can involve identifying those CNS genes that
change their activity in response to a non-tumorigenic dose of tumor cells (e.g., a
condition where neoplasia exists in the body, but no neoplasm is yet formed). From this
subset of CNS genes one can predict the modulating genes responsible for their changes
in activity, as explained above. Such modulating genes, which may be derived from the
neoplastic cells, are likely to be initial tumor-derived signals of alarm in the peripheral
body. Thus, a preventive genetic vaccine could be designed to generate a memory
response to such genes.

A vaccine can be, e.g., a polypeptide or nucleic acid corresponding to the gene to
be targeted. Vaccines described herein can be administered, or inoculated, to an
individual in a physiologically compatible solution such as water, saline, Tris-EDTA (TE)
buffer, or phosphate buffered saline (PBS). They can also be administered in the
presence of substances (e.g., facilitating agents and adjuvants) that have the capability of
promoting uptake or recruiting immune system cells to the site of inoculation. Vaccines
have many modes and routes of administration. They can be administered intradermally
(ID) or intramuscularly (IM), and by either route, they can be administered by needle
injection, gene gun, or needleless jet injection (e.g., Biojector™, Bioject Inc., Portland,
OR). Other modes of administration include oral, intravenous, intraperitoneal,
intrapulmonary, intravitreal, and subcutaneous inoculation. Topical inoculation is also
possible, and can be referred to as mucosal vaccination. Topical routes include, for example,
intranasal, ocular, oral, vaginal, or rectal routes. Delivery by these topical routes
can be by nose drops, eye drops, inhalants, suppositories, or microspheres.

The following examples are illustrative only and not intended to be limiting. 

EXAMPLES 

Example 1: CNS Gene Expression Profiles Associated With Colon Carcinoma

CNS gene expression profiles associated with the presence of a peripheral tumor
were identified using gene expression microarray analysis on brain tissue from
experimental animals implanted peripherally with tumor cells. This example describes
the identification of brain gene expression profiles associated with colon carcinoma.

Male BALB/c mice were injected subcutaneously with 5 x 10⁵ CT-26 WT cells, a
murine colon carcinoma cell line (ATCC cat. #: CRL-2638), resuspended in 300 µl of
PBS, as described below. Control mice were injected with the corresponding volume of
PBS following the same procedure. After a specified time, the animals were sacrificed,
their brains dissected, and first strand cDNA was synthesized from total polyA+ RNA
prepared from different brain regions, as described in detail below. Gene expression
microarray analysis was performed with the first strand cDNA by hybridizing to
preprinted slides (Corning's CMT-GAP™ II Coated Slides) containing Pan® Mouse 10K
Oligo set A (MWG Biotech). This slide set contains probes for 10,000 genes selected
from mouse genes that have been functionally defined.

The data from the microarray experiments was analyzed with a Virtek® 
ChipReader® laser scanner model A0-B0-05 (Virtek Vision Corp, Waterloo, ON, 
Canada) using the Virtek ChipReader v2.0 software, as described in more detail below. 

Experimental Methodology 

Cell Lines: The experimental work was based on the following murine cell lines:
CT26WT colon carcinoma (ATCC cat. #: CRL-2638), LL/2(LLC1) lung carcinoma
(ATCC cat. #: CRL-1642) and 4T1 breast carcinoma (ATCC cat. #: CRL-2539). All cell
lines were grown in P-100 plates with 10 ml of the corresponding medium. All culture
media were sterilized by filtration using a 0.22 µm CA filter. CT-26 cells were grown in
DMEM containing 1.5 g/L Sodium Bicarbonate, 10 mM HEPES, and 1 mM Sodium
pyruvate, supplemented with 10% Fetal Bovine Serum at 37°C with 5% CO2.
LL/2(LLC1) cells were grown in DMEM containing 4.5 g/L Glucose, 1.5 g/L Sodium
Bicarbonate, 10 mM HEPES, and 1 mM Sodium pyruvate, supplemented with 10% Fetal
Bovine Serum at 37°C with 5% CO2. 4T1 cells were grown in RPMI 1640 containing
4.5 g/L Glucose, 1.5 g/L Sodium Bicarbonate, 10 mM HEPES, and 1 mM Sodium
pyruvate, supplemented with 10% Fetal Bovine Serum at 37°C with 5% CO2.

In vivo studies: Six week-old animals were housed in a HEPA-filtered air rack, 5
animals per cage (both tumor and control animals in the same cage) with food and water
ad libitum. BALB/c males were injected subcutaneously with 5 x 10⁵ CT-26 WT cells
resuspended in 300 µl of PBS. BALB/c female mice were injected subcutaneously with
1 x 10⁵ 4T1 cells resuspended in 100 µl of PBS. C57BL/6 male mice were injected
subcutaneously with 1 x 10⁶ LL/2(LLC1) cells resuspended in 300 µl of PBS. Control
animals were injected with the corresponding volume of PBS following the same
procedure.

For each tumor type, 2 different experiments were performed and 3 time points
evaluated in duplicate. Each single time point corresponded to 15 mice. All injections
were done using a 27-G syringe. At the corresponding time, mice were killed by cervical
dislocation. Mice were immediately decapitated, and the brain was extracted and dissected using
the following procedure: the hypothalamus and the cerebellum were dissected, the brain
was cut with a surgical razor blade leaving the right and left hemispheres separated, and
two persons dissected the midbrain, the hippocampus, the prefrontal cortex and the
striatum from each brain hemisphere. All brain regions were immediately frozen in dry
ice and stored at -80°C until RNA extraction.

Preparation of Poly A+ RNA: Poly A+ RNA was obtained from total RNA using
the MicroPoly(A) Pure® kit from Ambion. In general, starting material was 400 µg total
RNA to which a volume of 5 M NaCl was added up to a final concentration of 0.45 M
NaCl. After mixing, samples were transferred to an RNase-free microfuge tube. After
adding binding buffer provided by the manufacturer, the RNA was heated for 5 minutes
at 65°C and immediately chilled on ice for 1 minute. Oligo(dT) Cellulose was added to
the sample, mixed by inversion and incubated for 60 minutes at room temperature with
gentle agitation. This was followed by centrifugation at 4,000 rcf for 3 minutes. After
the supernatant was removed, the pellet was treated with 1 ml binding buffer, mixed and
spun down by centrifuging at 4,000 rcf for 3 minutes. After removing the supernatant,
the pellet was washed 3 times with binding buffer followed by 4 washes with wash
buffer. The Oligo(dT) Cellulose was then resuspended in 400 µl of wash buffer provided by
the manufacturer and transferred to a spin column, where the resin was washed 4 more
times. When the flow-through of the column reached an absorbance of <0.05 OD at
A260, the mRNA was eluted from the Oligo(dT) Cellulose with 200 µl of Elution Buffer
(provided by the manufacturer) pre-warmed at 65°C. The eluted polyA+ RNA was
concentrated with a mixture containing 20 µl of 5 M Ammonium Acetate, 1 µl Glycogen
and 550 µl of 100% ethanol. After overnight precipitation at -20°C, samples were
centrifuged at 14,000 rcf for 20 minutes at 4°C. After careful removal of the supernatant,
the pellet containing the polyA+ RNA was resuspended in 10 µl of DEPC treated
Water/EDTA.
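
The volume of 5 M NaCl needed to bring a sample to a final concentration of 0.45 M follows from a simple mass balance, C_stock × V_add = C_final × (V_sample + V_add). The following minimal sketch solves that expression for V_add. It assumes the starting RNA solution contains no NaCl, and the 100 µl sample volume is a hypothetical value chosen for illustration, since the text does not state the sample volume.

```python
def stock_volume_to_add(sample_volume_ul, stock_molar=5.0, final_molar=0.45):
    """Volume of salt stock (ul) to add to a salt-free sample to reach final_molar.

    Solves stock_molar * v_add = final_molar * (sample_volume_ul + v_add).
    """
    if not 0 < final_molar < stock_molar:
        raise ValueError("final concentration must be between 0 and the stock concentration")
    if sample_volume_ul <= 0:
        raise ValueError("sample volume must be positive")
    return final_molar * sample_volume_ul / (stock_molar - final_molar)


sample_ul = 100.0  # hypothetical sample volume; not given in the text
v_add = stock_volume_to_add(sample_ul)

# Check: recompute the concentration reached after adding the stock.
achieved_molar = 5.0 * v_add / (sample_ul + v_add)
print(f"add {v_add:.2f} ul of 5 M NaCl to {sample_ul:.0f} ul -> {achieved_molar:.2f} M")
```

For a 100 µl sample this works out to just under 10 µl of stock. The simpler approximation of adding 9 µl per 100 µl (ignoring the added volume) would give about 0.41 M rather than 0.45 M.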

Labeling of probes for microarray hybridization: Labeling was performed by two
indirect methods. The first method used aminoallyl labeled nucleotides via first strand
cDNA synthesis using SuperScript Reverse Transcriptase followed by coupling of the
aminoallyl to either Cyanine 3 or 5 (Cy3/Cy5) fluorescent molecules (Amersham
Pharmacia). To 3 µg of poly(A+) RNA were added 0.6 µl Random Primers (pd(N)6,
Invitrogen) (3 µg/µl) and 1.2 µl Oligo(dT)12-18 (0.5 µg/µl). Milli-Q H2O was added up
to a final volume of 15.5 µl. The mixture was heated at 65°C for 5 min, chilled on ice
and spun down. 12.5 µl of a Master Mix containing 6 µl of 5X First Strand Buffer, 3 µl
of 100 mM DTT, 0.6 µl of 50X aminoallyl (Sigma Co)-dNTP mix (Amersham
Pharmacia), 1.5 µl of RNaseOUT (40 units/µl, Invitrogen) and 1.4 µl Milli-Q H2O were
added to each tube. The tubes were incubated at 37°C for 2 minutes, followed by the addition of 2 µl of
SuperScript II RT (Invitrogen). After incubation for 2 hr at 37°C, tubes were transferred
to 70°C for 15 min. At the end, tubes were spun down. RNA was degraded by the sequential
addition of 3 µl 2.5 M NaOH, incubated at 37°C for 15 min, then 15 µl of 2 M HEPES
free acid, 4.8 µl 3 M NaOAc (pH 5.2) and finally 150 µl of 100% EtOH. After mixing,
tubes were incubated at -20°C for 1 hr. Tubes were centrifuged for 30 min at 4°C, the
supernatant was removed and the pellet was washed twice in 70% ethanol. The pellet
was dissolved in 2.25 µl Milli-Q H2O. Coupling of fluorescent Cy3 and Cy5 was
performed by initially adding 2.25 µl of 0.2 M NaHCO3 (pH 9.0) and then 4.5 µl of the
DMSO/dye mixture to the 4.5 µl cDNA sample. Tubes were mixed well and incubated
for 1 hr at room temperature in the dark. For probe purification, 500 µl of Loading Buffer
were added to the sample and mixed. A SNAP Column (Invitrogen) was placed on a
collection tube and the sample loaded on the column and incubated at room temperature
for 2-5 min. The system was centrifuged at maximum speed for 1 min and the
flow-through was discarded. After two more washes the SNAP column was put back in the
collection tube and centrifuged at maximum speed for 30 sec to remove residual Wash
Buffer from the membrane filter. cDNA was eluted by adding 60 µl TE buffer to the
SNAP column, incubating for 2-5 min and centrifuging at maximum speed at room temperature
for 1-2 min. After saving the first eluate, the elution was repeated and both samples were
combined.

Alternatively, labeling of poly A+ RNA was performed with a Clontech kit
following the manufacturer's instructions. Briefly, to 3 µg of poly(A+) RNA were added
0.6 µl Random Primers (pd(N)6) (3 µg/µl), 0.5 µl Oligo(dT)12-18 (0.5 µg/µl), and
deionized H2O up to 25 µl. After heating at 70°C for 5 min the tubes were placed at
37°C, and then 25 µl of Master Mix were added (10 µl of 5X cDNA Synthesis Buffer,
5 µl of 10X dNTP Mix, 7.5 µl H2O and 2.5 µl MMLV Reverse Transcriptase
(200 units/µl)). Tubes were incubated at 37°C for 1 hr, followed by 5 min at 70°C. After
a few minutes at 37°C, RNA was eliminated by adding 0.5 µl RNase H for 15 min at 37°C
and then 0.5 µl of 0.5 M EDTA (pH 8.0) together with 5 µl of QuickClean resin. After
inserting a 0.45-µm Spin Filter into the collection tube, the sample was transferred into
the Spin Filter. The cDNA was concentrated by the addition of 3 M Sodium Acetate and
ice-cold 100% ethanol, followed by centrifugation at maximal speed for 20 min at 4°C.
The pellet was washed once in 70% ethanol, air dried and dissolved in 10 µl 2X
Fluorescent Labeling Buffer. Fluorescent dye coupling was performed by adding 10 µl of
the DMSO/dye mixture to 10 µl of the cDNA sample. This mixture was mixed well and
incubated for 30 min in the dark. 2 µl of 3 M Sodium Acetate and 50 µl of 100% ethanol
were added. After 2 hr at -20°C the tube was centrifuged for 20 min. After washing
once in 70% ethanol, the pellet was dissolved in 100 µl H2O. Probes were purified
on NucleoSpin columns.

Quantification of the levels of incorporation of dyes and total DNA: The extent of
dye incorporation was obtained from the absorbance at 550 nm and 650 nm for Cy3- and
Cy5-probes, respectively. The amount of DNA was obtained from the absorbance at
260 nm. At the end of the entire procedure the amount of total DNA obtained was
0.34-0.65 µg DNA / 1 µg poly A+ RNA for the Clontech procedure and 0.8-1.2 µg
DNA / 1 µg poly A+ RNA for the Superscript indirect labeling procedure. The
percentage of dye incorporation was 5-15% in the first case and 7.5-20% in
the second.

Microarrays and Data Analysis

Prehybridization: The Prehybridization Buffer (5 ml of 20X SSC Buffer, 0.25 ml
of 20% SDS, 5 ml of 10% BSA and 24.75 ml of Milli-Q H2O) was preheated at 42°C.
The printed slide was put in a 50 ml Falcon polypropylene tube containing the preheated
prehybridization buffer and incubated at 42°C for 40 min. After washing five times, 1
min each, with Milli-Q H2O preheated at 42°C in a Wash Station, slides were washed four
or five times in 2-propanol. The slide was dried by centrifugation for 1 min using a
Microarray Centrifuge. Cover glasses were washed with Milli-Q H2O and 2-propanol
and dried. Slides were used immediately for hybridization.

Probe preparation: 2 µg of probe (1 µg Cy3- + 1 µg Cy5-labeled probes) were
used per slide. This amount represents 100-110 pmoles and 70-80 pmoles of Cy3 and
Cy5 incorporated dye, respectively. If dye incorporation levels were below those values,
the amount of nucleotide was increased to reach them (picomoles of labeled
probe). The probes were concentrated by speedvac to about 20 µl, combined and mixed
well.

Hybridization: Each hybridization mix contained 20 µl of 4X Hybridization
Buffer (Amersham Pharmacia; Cat. No. RPK0325), 24 µl of formamide (final
concentration, 30%) and 16 µl of Salmon Sperm DNA (final concentration 1 µg/µl).
This master mix was added to the probes, mixed well, heated at 95°C for 3 min, snap
cooled on ice for 1 min and centrifuged at 16,000 x g for 1 min. A pre-hybridized
microarray slide (array side up) was placed in a hybridization chamber. Two Parafilm
strips were put at both sides of the array printed area. Finally, the probe was placed
carefully on the top of the slide surface followed by the coverslip on top of it (FIG. 1).
10 µl of Milli-Q H2O (20 µl total) was added to the small wells at each end of the
chamber to seal the chamber. Slides were incubated at 42°C for 16-20 hours with gentle
mixing in a 3D-rotator. At the end of the hybridization, the slide was carefully removed
and washed for 5 min with agitation in washing buffer (2X SSC, 0.1% SDS) preheated at
42°C. Slides were washed twice more in different chambers, each time for 5 minutes
(first in 1X SSC and then in 0.1X SSC). The slide was dried by centrifugation for 1 min
in a microarray centrifuge and placed in a light-tight slide box until scanning.
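The stated final formamide concentration can be checked with simple volume arithmetic. The sketch below assumes that the combined Cy3/Cy5 probe contributes about 20 µl to the mix, which is the reading of the probe preparation step that reproduces the stated 30%; the function name and the default probe volume are illustrative assumptions, not part of the protocol.

```python
def final_concentrations(probe_ul=20.0):
    """Final concentrations of the hybridization mix for a given probe volume.

    Assumption: the combined probe volume is about 20 ul; the other three
    volumes are the ones listed in the hybridization step.
    """
    buffer_ul = 20.0      # 4X Hybridization Buffer
    formamide_ul = 24.0   # formamide
    salmon_dna_ul = 16.0  # Salmon Sperm DNA
    total_ul = probe_ul + buffer_ul + formamide_ul + salmon_dna_ul
    return {
        "total_ul": total_ul,
        "formamide_percent": 100.0 * formamide_ul / total_ul,
        "buffer_x": 4.0 * buffer_ul / total_ul,
    }


mix = final_concentrations()
print(mix)
```

Under this assumption the total volume is 80 µl, so the 4X buffer is diluted to 1X and formamide makes up 24/80 of the volume.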

Data acquisition and image processing: The slides were scanned with a Virtek
ChipReader laser scanner model AO-BO-05 (Virtek Vision Corp, Waterloo, ON, Canada)
using the Virtek ChipReader v2.0 software. Three images were obtained for each of the
Cy3 and Cy5 channels with different detector sensitivity values for each slide, with a
resolution of 10 µm and a pixel depth of 16 bits. The images were stored as 16-bit TIFF
files (Tagged Image File Format) and analyzed with Virtek ChipReader v2.0 software.
The results were stored in plain text files with the following tab-separated fields:
GridName, Column#, Row#, CentroidX, CentroidY, SNR, Signal Average,
Signal Median, Signal Std, Signal pixels, Background Average, Background Median,
Background Std.

Data filtration and normalization: All the data processing was performed under
the R System v1.6.2 (R Development Core Team). The data was filtered to
eliminate background data points (spots with size less than 75 pixels or with a mean to
median correlation less than 80%; Tran and Peiffer, 2002, Nucleic Acids Res. 30(12),
e54), to eliminate saturated data points (spots with a proportion of saturated pixels greater
than 20%), and to eliminate low signal data points (spots with signal to noise ratio below
2). The signal was corrected for background and the signal volume was estimated as
(Signal Average - Background Average) x Signal pixels. The base 2 logarithm of the
ratio and the product between Cy5 and Cy3 was calculated as:

$$ M = \log_2(\mathrm{Cy5}/\mathrm{Cy3}) \tag{1} $$

$$ A = \log_2(\mathrm{Cy5} \times \mathrm{Cy3}) \tag{2} $$

Data was normalized using locally weighted linear regression of M vs. A over
rank consistency filtered data for each print tip (rank consistent print tip lowess fit) for
150 or more data points, or using a rank consistent global lowess fit algorithm when the
number of data points was below 150.
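The spot filtering step and equations (1) and (2) can be sketched as follows. This is a minimal illustration, not the R code used in the study: the `Spot` record and its field names are hypothetical, the mean-to-median correlation and the saturated-pixel proportion are assumed to be available per spot, and dropping spots with a non-positive background-corrected volume is an added assumption, since the logarithm is undefined there.

```python
import math
from typing import NamedTuple, Optional, Tuple


class Spot(NamedTuple):
    """One channel of one spot (hypothetical field names)."""
    signal_average: float
    background_average: float
    signal_pixels: int
    snr: float                 # signal to noise ratio
    mean_median_corr: float    # mean to median correlation, 0..1
    saturated_fraction: float  # proportion of saturated pixels, 0..1


def passes_filters(spot: Spot) -> bool:
    """Size, quality, saturation and signal filters with the stated cut-offs."""
    return (
        spot.signal_pixels >= 75
        and spot.mean_median_corr >= 0.80
        and spot.saturated_fraction <= 0.20
        and spot.snr >= 2.0
    )


def signal_volume(spot: Spot) -> float:
    """(Signal Average - Background Average) x Signal pixels."""
    return (spot.signal_average - spot.background_average) * spot.signal_pixels


def m_and_a(cy5: Spot, cy3: Spot) -> Optional[Tuple[float, float]]:
    """Return (M, A) per equations (1) and (2), or None if the spot is rejected."""
    if not (passes_filters(cy5) and passes_filters(cy3)):
        return None
    v5, v3 = signal_volume(cy5), signal_volume(cy3)
    if v5 <= 0 or v3 <= 0:  # assumption: drop spots whose log is undefined
        return None
    return math.log2(v5 / v3), math.log2(v5 * v3)
```

Note that A is computed here exactly as written in equation (2), as the logarithm of the product, without the factor of one half used in some other MA-plot conventions.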

For the rank consistency filter, the ranks of the Cy3 and Cy5 intensities of each gene
on the slide were separately calculated. For a given gene, if the ranks of the Cy3 and Cy5
intensities differed by less than a threshold value d, this gene was classified as rank
consistent. This process was iteratively repeated until the number of rank consistent genes
did not change. The threshold level d was defined as follows:

$$ d = p \cdot (1 + 1/i) \cdot f_N(A) \cdot n \tag{3} $$

where i is the iteration number; n is the number of data points, which is equal to
the number of spots in the slide for the first iteration cycle, and equal to the number of
rank consistent genes defined in cycle i-1 for the next cycles; f_N(A) is the normal density
function with mean equal to the average of A and standard deviation (SD) equal to the
estimated SD of A; and p is a proportionality constant that was set to 0.5.
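The iterative rank consistency filter can be sketched as below. This is an illustrative reading of the description rather than the original implementation: it assumes that the mean and SD of A are re-estimated from the currently retained spots at every cycle, that ties in rank are broken by input order, and that a `max_iter` safety limit is acceptable; the function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def ranks(values):
    """Return 1-based ranks (ties broken by input order)."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    out = [0] * len(values)
    for r, k in enumerate(order, start=1):
        out[k] = r
    return out


def rank_consistent(cy3, cy5, p=0.5, max_iter=100):
    """Return the indices of spots classified as rank consistent."""
    keep = list(range(len(cy3)))
    for i in range(1, max_iter + 1):
        n = len(keep)
        if n < 3:
            break
        a = [math.log2(cy5[k] * cy3[k]) for k in keep]
        sd = stdev(a)
        if sd == 0:
            break
        density = NormalDist(mean(a), sd)  # f_N in the threshold formula
        r3 = ranks([cy3[k] for k in keep])
        r5 = ranks([cy5[k] for k in keep])
        kept = [
            k for j, k in enumerate(keep)
            # threshold d = p * (1 + 1/i) * f_N(A) * n, evaluated per spot
            if abs(r3[j] - r5[j]) < p * (1 + 1 / i) * density.pdf(a[j]) * n
        ]
        if len(kept) == n:  # number of rank consistent genes did not change
            break
        keep = kept
    return keep
```

Because the threshold scales with n times the density of A, it is most permissive near the centre of the intensity distribution and on slides with many spots; on very small inputs almost any rank difference exceeds it.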

Then, the value of M for each value of A that follows the central tendency of the
data (Mc) was estimated from the rank consistency data with the R lowess
function, and it was subtracted from the empirical M value to obtain the normalized M
data (M'). From this point, all further analysis was performed with the normalized M'
data.

Outlier data points were eliminated from the triplicate data with a leave-one-out
algorithm. Briefly, a data point was discarded as an outlier if it was outside the
confidence interval defined by the other two data points with a confidence level of 95%.

A gene expression data set was generated for each slide with the average of
non-outlier data points.
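One way to read the leave-one-out rule for triplicates is sketched below. The text does not specify how the 95% interval is built from two points; this sketch assumes a Student t confidence interval for the mean of the two remaining values (1 degree of freedom, two-sided quantile 12.706), which is an assumption made for illustration only.

```python
from statistics import mean, stdev

# Two-sided 95% Student t quantile for 1 degree of freedom.
T_975_DF1 = 12.706


def leave_one_out(triplicate):
    """Return the values of a triplicate that are not flagged as outliers."""
    kept = []
    for idx, value in enumerate(triplicate):
        others = [v for j, v in enumerate(triplicate) if j != idx]
        centre = mean(others)
        # Half-width of the assumed t interval around the mean of the others.
        half_width = T_975_DF1 * stdev(others) / len(others) ** 0.5
        if abs(value - centre) <= half_width:
            kept.append(value)
    return kept


def spot_average(triplicate):
    """Average of the non-outlier values, or None if all were discarded."""
    kept = leave_one_out(triplicate)
    return mean(kept) if kept else None
```

With only two reference points the interval is wide whenever those two points disagree, so in practice the rule removes a value only when the other two agree closely with each other.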

Differences in scale between Cy3 and Cy5 channels can lead to an asymmetric
distribution of M' data. To correct this deviation, M' data was transformed to be normally
distributed. First, a uniformly distributed data set between 0 and 1 was obtained with the
transformation Mu = rank(M')/(n + 1), where n is the number of data points. Then, a
normally distributed data set, with mean equal to the median of M' and SD equal to the
estimated SD of M', was obtained with the transformation Mn = qnorm(Mu), where
qnorm is the normal quantile function included in the R package.
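The rank-based transformation to a normal distribution can be sketched in Python, with `statistics.NormalDist.inv_cdf` standing in for the R `qnorm` function. Average ranks are used for ties, matching the default behaviour of `rank` in R; the function names are illustrative.

```python
from statistics import NormalDist, median, stdev


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = shared
        pos = end + 1
    return ranks


def normal_scores(m_prime):
    """Map M' to Mn: uniform scores Mu = rank/(n + 1), then normal quantiles."""
    n = len(m_prime)
    target = NormalDist(mu=median(m_prime), sigma=stdev(m_prime))
    uniform = [r / (n + 1) for r in average_ranks(m_prime)]  # Mu
    return [target.inv_cdf(u) for u in uniform]              # Mn


print(normal_scores([3.0, 1.0, 2.0, 10.0, 4.0]))
```

Dividing by n + 1 rather than n keeps every Mu strictly between 0 and 1, so the quantile function is always finite. The transformation preserves the ordering of the data while removing asymmetry.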

Data integration between replicated slides: Each labeled probe was hybridized at
least twice. If the scale (i.e. variance) between replicated slides was different (p < 0.05,
Fligner-Killeen test for homogeneity of variances), data were transformed to be equally
scaled. Assuming that the ratios follow a normal distribution with mean zero and variance
a_i^2 σ^2, we estimated a_i as follows:

$$ a_i = \frac{\mathrm{MAD}_i}{\sqrt[I]{\prod_{i=1}^{I} \mathrm{MAD}_i}} \tag{4} $$

with I denoting the total number of slides, and the median absolute deviation MAD
defined by

$$ \mathrm{MAD}_i = \mathrm{median}_j \left\{ \left| M_{ij} - \mathrm{median}_j(M_{ij}) \right| \right\} \tag{5} $$

where M_ij denotes the j-th spot in the i-th slide.

Outlier data points were eliminated from three or more replicated data sets with a
leave-one-out algorithm as described above, and an integrated data set was obtained with
the averaged M values from non-outlier data.

Analysis and integration of replicated experimental data and noise analysis:
Experiments were performed at least twice. If the scale (i.e. variance) between replicated
experiments was different (p < 0.05, Fligner-Killeen test for homogeneity of variances),
data were transformed to be equally scaled.

Outlier data points were eliminated from three or more replicated data sets with a
leave-one-out algorithm. The arithmetic mean (Mn) and SD were estimated from the
non-outlier data set. A noise sampling method was used for p-value estimation (Draghici et
al., Noise sampling method: an ANOVA approach allowing robust selection of
differentially regulated genes measured by DNA microarrays, Bioinformatics, in press).
Briefly, an estimate of the noise is obtained from the replicated data as the difference
between the ratio expression for gene g in experiment e and the mean for gene g among
experiments. Because noise varies with intensity (low intensity spots tend to have more
noise than high intensity ones), the intensity range was divided into bins, and noise
distributions were constructed for each such bin. Assuming that the noise distribution is
normal, which was the case for most experiments, it was mapped from the distribution of
the noise to the distribution of the log ratios by the scaling factor 1/sqrt(n - 0.5), where
n is the number of replicates.
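The scale adjustment between replicated slides described above can be sketched as follows, taking each slide's scale factor as its median absolute deviation (MAD) divided by the geometric mean of the MADs of all slides. The sketch assumes every slide has a non-zero MAD and does not include the Fligner-Killeen test that decides whether rescaling is applied; function names are illustrative.

```python
import math
from statistics import median


def mad(values):
    """Median absolute deviation of one slide's M values."""
    centre = median(values)
    return median([abs(v - centre) for v in values])


def scale_factors(slides):
    """a_i = MAD_i divided by the geometric mean of all MAD_i."""
    mads = [mad(s) for s in slides]
    geo_mean = math.exp(sum(math.log(m) for m in mads) / len(mads))
    return [m / geo_mean for m in mads]


def rescale(slides):
    """Divide each slide's M values by its scale factor."""
    factors = scale_factors(slides)
    return [[v / a for v in s] for s, a in zip(slides, factors)]


slides = [[-1.0, 0.0, 1.0, 2.0, -2.0], [-4.0, 0.0, 4.0, 8.0, -8.0]]
print(scale_factors(slides))
```

Using the geometric mean as the denominator makes the scale factors multiply to one, so the rescaling changes the relative spread of the slides without shifting the overall scale of the data set.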

Cluster analysis: Before cluster analysis, the data was scaled as follows:
Ms = (M - Mn(M)) / SD(M). A figure of merit algorithm (Yeung et al., 2001,
Bioinformatics, 17(4):309-18) was used to identify the clustering algorithm and the
number of clusters that minimized the intra-cluster variability. After examining the figure
of merit of all the datasets analyzed with seven different clustering algorithms and
different variations of such algorithms that led to a total of 51 different clustering
methods, it was decided to subset the data into 12 clusters with a hierarchical algorithm
using Euclidean distance between gene expression patterns and Ward's minimum
variance agglomeration method (Hartigan (1975). Clustering Algorithms. New York:
Wiley). Genes with similar expression patterns among the experiments were clustered
together using routine hierarchical clustering techniques.
Results 

After quality filtering and normalizing the microarray data, sequences with a
p-value below 0.05 were identified as differentially expressed (DE). Further analysis was
performed on this subset of sequences to select and cluster sequences according to
specific criteria. Genes with similar expression patterns among the experiments were
clustered together using hierarchical clustering techniques as described above.

Cluster analysis I: A first clustering analysis (cluster analysis I) identified DE
sequences (p < 0.05) up- or down-regulated in one of two experimental time points tested
(72 and 192 hours). These are shown for the colon cancer model in FIG. 2 (DE
sequences in midbrain), FIG. 8 (DE sequences in cortex), FIG. 14 (DE sequences in
striatum), and FIG. 18 (DE sequences in hypothalamus). Cluster graphs (the last sheet of
each Figure) show whether genes in a particular cluster were up- or down-regulated
(y-axis) at each time point tested (x-axis). For example, in FIG. 2-8, clusters 4-6 were
up-regulated at 72 hours (x-axis 1.0) and clusters 9-12 were up-regulated at 192 hours
(x-axis 2.0). In FIG. 8-13, clusters 9 and 10 were down-regulated at 192 hours while
clusters 11 and 12 were up-regulated at 192 hours. In FIG. 14-8, clusters 4, 5 and 6 are
close to midline at 72 hours and down-regulated at 192 hours. FIG. 18-8 shows clusters
11 and 12 up-regulated at 192 hours.

Cluster analysis II: A more stringent clustering analysis (cluster analysis II)
revealed DE sequences (p < 0.05) up- or down-regulated in both experimental time points
tested. These are shown in FIG. 3 (DE sequences in midbrain), FIG. 9 (DE sequences in
cortex), FIG. 15 (DE sequences in striatum), and FIG. 19 (DE sequences in the
hypothalamus). For example, FIG. 3-2 shows that only cluster 7 is up-regulated at both
time points, while the rest of the clusters are down-regulated at both time points.
FIG. 19-2 shows that only cluster 4 is up-regulated at both time points.

Secreted markers: In a third analysis, the filtered data were reclustered to select
sequences that should correspond to a secreted product and have a p value for differential
expression below 0.05 (p<0.05). The results of this analysis for colon cancer are shown in
FIG. 24(A). FIG. 24(A) lists markers corresponding to secreted products that were
differentially expressed in the colon cancer model at any time point studied. Secreted
markers are particularly useful in that their expression can be detected in cerebral or
cerebrospinal fluid, avoiding the need for a solid tissue biopsy.

Shared DE markers: In a final analysis, the filtered data were reclustered to select
sequences that were differentially expressed in all tumors analyzed. Seven DE sequences
were found to be shared among all three carcinomas studied. These were hepatocyte
growth factor (HGF), apherin A3, chemokine (C-C motif) ligand 4, growth differentiation
factor-9b (GDF-9b); bone morphogenetic protein 15 (BMP 15), neuroblastoma
suppressor of tumorigenicity 1, melanocyte proliferating gene 1, and fibroblast growth
factor 22 (FGF 22) (FIG. 26).

Example 2: CNS Gene Expression Profile Associated With Breast Carcinoma

This example describes the identification of brain gene expression profiles
associated with breast carcinoma.

BALB-C mice were injected subcutaneously with 1 x 10^5 4T-1 breast carcinoma
cells (ATCC cat #: CRL-2539) resuspended in 100 µl of PBS. All experimental
methods, microarrays and data analysis were otherwise performed as described above for
Example 1.

Results 

Quality filtering, normalization and analysis of the microarray data were
performed as discussed above.

Cluster analysis I: A first clustering analysis (cluster analysis I) identified DE
sequences (p < 0.05) up- or down-regulated in only one of the three experimental time
points tested (18, 72 and 192 hours). These are shown for the breast cancer model in
FIG. 4 (DE sequences in midbrain), FIG. 10 (DE sequences in cortex), and FIG. 20 (DE
sequences in hypothalamus). For example, FIG. 4-16 shows that the genes of clusters 1, 2
and 3 were up-regulated at 72 hours (x-axis 2.0). In FIG. 10-13, clusters 2 and 3 show
up-regulation at 72 hours. In FIG. 20-21, only clusters 5-7 are down-regulated at 192
hours (x-axis 3.0).

Cluster analysis II: A more stringent clustering analysis (cluster analysis II)
revealed DE sequences (p < 0.05) up- or down-regulated in at least two of the three
experimental time points. These are shown in FIG. 5 (DE sequences in midbrain),
FIG. 11 (DE sequences in cortex), and FIG. 21 (DE sequences in hypothalamus). In
FIG. 5, only cluster 7 shows up-regulation at any time point, while the remaining clusters
are generally down-regulated. Similarly, only one cluster (cluster 6) is up-regulated in
FIG. 11-6. Only clusters 10-12 are up-regulated in FIG. 21-7.

Secreted markers: In a third analysis, the filtered data were reclustered to select
sequences that correspond to a secreted product and have a p value for differential
expression below 0.05 (p<0.05). The results of this analysis for breast cancer are shown in
FIG. 24B. FIG. 24B lists markers corresponding to secreted products that were
differentially expressed in the breast cancer model at any time point studied.

Example 3: CNS Gene Expression Profile Associated With Lung Carcinoma

This example describes the identification of brain gene expression profiles
associated with lung carcinoma.

Male C-57/BL6 mice were injected subcutaneously with 1 x 10^6 lung carcinoma
LL/2(LLC1) cells (ATCC cat #: CRL-1642) resuspended in 300 µl of PBS. All
experimental methods, microarrays and data analysis were otherwise performed as
described above for Example 1.
Results 

Quality filtering, normalization and analysis of the microarray data were
performed as discussed above.

Cluster analysis I: A first clustering analysis (cluster analysis I) identified DE
sequences (p < 0.05) up- or down-regulated in only one of the three experimental time
points tested (18, 72 and 192 hours). These are shown for the lung cancer model in
FIG. 6 (DE sequences in midbrain), FIG. 12 (DE sequences in cortex), FIG. 16 (DE
sequences in striatum), and FIG. 22 (DE sequences in hypothalamus). For example,
FIG. 6-15 shows that clusters 2, 3 and 4 are up-regulated at 72 hours (x-axis 2.0) while
clusters 5-11 are up-regulated at 18 hours (x-axis 1.0). FIG. 12-13 shows that clusters 2,
3 and 4 are up-regulated at 72 hours. In FIG. 22-21, only clusters 3, 4, 8 and 9 are
down-regulated at 192 hours.

Cluster analysis II: A more stringent clustering analysis (cluster analysis II)
revealed DE sequences (p < 0.05) up- or down-regulated in at least two of the three
experimental time points. These are shown in FIG. 7 (DE sequences in midbrain),
FIG. 13 (DE sequences in cortex), FIG. 17 (DE sequences in striatum), and FIG. 23 (DE
sequences in hypothalamus). In FIG. 7-5 and FIG. 17-5, all the clusters, except for
cluster 12 in each set, are down-regulated at every time point studied. FIG. 13-6 shows
that all but two clusters (11 and 12) were down-regulated.

Secreted markers: In a third analysis, the filtered data were reclustered to select
sequences that correspond to a secreted product and have a p value for differential
expression below 0.05 (p<0.05). The results of this analysis for lung cancer are shown in
FIG. 24C. FIG. 24C lists markers corresponding to secreted products that were
differentially expressed in the lung cancer model at any time point studied.

Example 4: Diagnosis of Cancer in a Human by Detecting a Gene Product Profile

This example describes a diagnostic test for non-CNS carcinoma performed on a 
human subject. The subject is a carrier of the BRCA1 breast cancer susceptibility gene. 

A CSF sample is obtained from the subject by means of a lumbar puncture. This 
procedure is done on an outpatient basis under local anesthetic. The CSF sample is used 
immediately in the diagnostic assay, or is cooled or frozen and stored or transported to a 
facility where the diagnostic test is performed. 

The diagnostic test involves contacting the CSF sample to an antibody array 
containing a panel of 25 antibodies that can detect a set (cluster) of CNS gene products 
that are associated with the presence of breast cancer when secreted in a characteristic 
profile in the CSF. The panel includes antibody probes for one or more CNS markers for 
breast carcinoma listed in FIG. 24B. Thus, in this example, the characteristic profile is 
the CNS "reference profile" for breast carcinoma. 

The results of the antibody array are obtained by routine techniques, such as 
fluorescence detection and measurement of bound antibody vs. unbound antibody for 
each position (each antibody) on the array. A dataset of the value for the level of each 
polypeptide detected in the CSF sample by each antibody on the array is generated. The 
dataset is used directly as the test expression profile or the dataset is converted into a 
format comparable to the format of the reference profile to which the test profile is 
compared. 

Once the test expression profile is generated, the test profile is compared to the 
reference expression profile. In this example, the reference profile is a dataset that 
includes relative values of expression for a panel of 10 CNS gene products secreted into 
the CSF, all of which are known to be down-regulated at least 30%, on average, in 
subjects who have early stage breast cancer. The gene products include one or more of 
the gene products shown in FIG. 4, 10, 20, 5, 11, 21 or 24B. If the test profile shows that
7 or more of the genes in the panel are down-regulated by at least 20% in the test sample, 
the test profile matches the reference profile and the subject is determined to have (or be 
at risk for) early stage breast cancer. 
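The matching rule in this example can be expressed as a short sketch. The gene identifiers, the dictionary layout and the function name are hypothetical; the thresholds (7 or more of the panel genes, each down-regulated by at least 20% in the test sample) are those stated in the example.

```python
def matches_reference(test_changes, panel, min_genes=7, min_decrease=0.20):
    """Return True if the test profile matches the reference profile.

    test_changes maps a gene name to its fractional change relative to a
    control level (for example, -0.25 means 25% lower). Genes missing from
    the test profile are treated as unchanged, which is an assumption.
    """
    down = [g for g in panel if test_changes.get(g, 0.0) <= -min_decrease]
    return len(down) >= min_genes


# Hypothetical 10-gene panel with placeholder names.
panel = [f"gene_{k}" for k in range(1, 11)]
test_profile = {g: -0.30 for g in panel[:7]}
test_profile.update({g: 0.05 for g in panel[7:]})
print(matches_reference(test_profile, panel))
```

In practice the fractional changes would be derived from the antibody array readout after conversion to the same format as the reference profile.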






15138-003P01/ ER/CR-13.150



OTHER EMBODIMENTS 

A number of embodiments of the invention have been described. Nevertheless, it 
will be understood that various modifications may be made without departing from the 
spirit and scope of the invention. Accordingly, other embodiments are within the scope 
of the following claims.

We claim:

1. A method of diagnosing a non-central nervous system (non-CNS) disorder in a
subject, the method comprising:

detecting expression of a gene in a CNS sample of the subject, and

correlating the result of the detecting step to the presence or absence of a
non-CNS disorder.

2. The method of claim 1, further comprising the step of obtaining the CNS sample.

3. The method of claim 1, wherein the CNS sample is one or more brain cells.

4. The method of claim 3, wherein the brain cells are selected from the group
consisting of cells from: the hypothalamus, the midbrain, the prefrontal cortex and the
striatum.

5. The method of claim 1, wherein the CNS sample is cerebrospinal fluid.

6. The method of claim 1, wherein the non-CNS disorder is selected from the group
consisting of: cancer, rheumatoid arthritis, asthma, diabetes and obesity.

7. The method of claim 1, wherein the non-CNS disorder is a carcinoma.

8. The method of claim 1, wherein the non-CNS disorder is a solid tumor less than
0.5 cm in diameter.

9. The method of claim 1, wherein the gene encodes a gene product selected from
the group consisting of: a hormone, a growth factor, an immune system component, and a
cytokine.






10. The method of claim 7, wherein the gene encodes a gene product listed in any of
FIGS. 2-26, or a human or other mammalian homolog thereof.

11. The method of claim 7, wherein the gene encodes a gene product selected from
the group consisting of: hepatocyte growth factor (HGF), apherin A3, chemokine (C-C
motif) ligand 4, growth differentiation factor-9b (GDF-9b); bone morphogenetic protein
15 (BMP 15), neuroblastoma suppressor of tumorigenicity 1, melanocyte proliferating
gene 1, and fibroblast growth factor 22 (FGF 22).

12. The method of claim 1, wherein detecting expression of the gene comprises
detecting the mRNA corresponding to the gene.

13. The method of claim 1, wherein detecting expression of the gene comprises
detecting a polypeptide product encoded by the gene.

14. The method of claim 1, wherein detecting comprises detecting expression of a
plurality of genes in a CNS sample of the subject.

15a. The method of any one of the preceding claims, wherein the detecting step
comprises performing a microarray assay.

15b. The method of claim 1, wherein the subject is a human.

16. A method of diagnosing a non-central nervous system (non-CNS) disorder in a
subject, the method comprising:

obtaining a test gene expression profile for two or more CNS genes from the
subject; and

comparing the test gene expression profile with a reference gene expression
profile associated with the presence of a non-CNS disorder, wherein a test gene
expression profile that matches the reference gene expression profile indicates the subject
has a non-CNS disorder.

17. The method of claim 16, further comprising generating a record of the result of
the comparing step; and optionally transmitting the record to the subject, health care
provider or other party.

18. The method of claim 16, wherein the non-CNS disorder is selected from the group
consisting of: cancer, rheumatoid arthritis, asthma, diabetes and obesity.

19. The method of claim 16, wherein the non-CNS disorder is a carcinoma.

20. The method of claim 16, wherein the non-CNS disorder is a solid tumor less than
0.5 cm in diameter.

21. The method of claim 16, wherein at least one of the two or more CNS genes is
selected from the group consisting of: a hormone, a growth factor, an immune system
component, and a cytokine.

22. The method of claim 19, wherein at least one of the two or more CNS genes
encodes a gene product listed in FIGS. 2-26, or a human or other mammalian homolog
thereof.

23. The method of claim 19, wherein at least one of the two or more CNS genes
encodes a gene product selected from the group consisting of: hepatocyte growth factor
(HGF), apherin A3, chemokine (C-C motif) ligand 4, growth differentiation factor-9b
(GDF-9b); bone morphogenetic protein 15 (BMP 15), neuroblastoma suppressor of
tumorigenicity 1, melanocyte proliferating gene 1, and fibroblast growth factor 22
(FGF 22).

24. The method of claim 16, wherein the step of obtaining the test gene expression
profile comprises detecting mRNA corresponding to the two or more CNS genes.






25. The method of claim 16, wherein the step of obtaining the test gene expression
profile comprises detecting polypeptide products encoded by the two or more CNS genes.

26. The method of claim 16, comprising obtaining a test gene expression profile for
a plurality of CNS genes.

27. The method of any one of claims 16-26, wherein the step of obtaining the test
gene expression profile comprises performing a microarray assay.

28. A method of treating a subject, the method comprising:

diagnosing a non-central nervous system (non-CNS) disorder according to the
method of claim 1 or 16; and

administering to the subject a therapeutic agent for the disorder.

29. The method of claim 28, wherein the therapeutic agent is a chemotherapeutic agent.

30. The method of claim 29, wherein the chemotherapeutic agent is selected from the
group consisting of: an antitubulin/antimicrotubule drug, a topoisomerase I inhibitor, an
antimetabolite, and an alkylating agent.

31. A method of identifying a diagnostic marker for a non-central nervous system
(non-CNS) disorder in a human, the method comprising:

inducing a non-CNS disorder in a test experimental animal;

comparing expression of a gene in a CNS sample from the test experimental
animal to expression of the gene in a CNS sample from a control experimental animal;
and

selecting as a diagnostic marker a human homolog of a gene that is differentially
expressed in the CNS sample from the test experimental animal compared to the CNS
sample from the control experimental animal.

32. The method of claim 31, wherein a non-CNS neoplasm is induced by chemical or
radiation mutagenesis.

33. The method of claim 31, wherein a non-CNS neoplasm is induced by
administering a neoplastic cell to the experimental animal.

34. The method of claim 31, wherein the experimental animal is an animal model of
rheumatoid arthritis, asthma, obesity or diabetes.

35. The method of claim 31, wherein the experimental animal is a non-human
primate.

36. The method of claim 1, 16 or 28, wherein the subject lacks a clinical sign of a
disorder as evaluated by imaging analysis.

37. The method of claim 1, 16 or 28, wherein the subject has a family history of the
disorder.

38. The method of claim 1, 16 or 28, wherein the subject is a carrier of a gene
associated with increased risk of the disorder.

39. The method of claim 38, wherein the subject is a carrier of the BRCA1, BRCA2,
hMSH2, hMLH1, or hMSH6 gene.



Figure 24A
Figure 24B
Figure 24C

DIFFERENTIALLY EXPRESSED GENES (SELECTED)

FIGURE 25A

DRUG TARGETING CANDIDATES

FIGURE 26

CNS markers differentially expressed in all tumors analyzed

ABSTRACT

The invention features methods and compositions for diagnosing non-central
nervous system (non-CNS) disorders by detecting changes in gene expression in the
CNS.